Message-ID: <75970f9d-7e59-5fba-280a-d0d935fc4d2f@grimberg.me>
Date: Wed, 4 Nov 2020 11:15:27 -0800
From: Sagi Grimberg <sagi@...mberg.me>
To: Christoph Hellwig <hch@...radead.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: Jens Axboe <axboe@...nel.dk>, linux-block@...r.kernel.org,
Thomas Gleixner <tglx@...utronix.de>,
David Runge <dave@...epmap.de>, linux-rt-users@...r.kernel.org,
linux-kernel@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>,
Daniel Wagner <dwagner@...e.de>, Mike Galbraith <efault@....de>
Subject: Re: [PATCH 3/3] blk-mq: Use llist_head for blk_cpu_done
>>> There really aren't any rules for this, and it's perfectly legit to
>>> complete from process context. Maybe you're a kthread driven driver and
>>> that's how you handle completions. The block completion path has always
>>> been hard IRQ safe, but possible to call from anywhere.
>>
>> I'm not trying to impose restrictions or forbid completions from a
>> kthread. I'm trying to avoid the softirq dance where it adds no
>> value. We could:
>
>> to not break that assumption you just mentioned and provide
>> |static inline void blk_mq_complete_request_local(struct request *rq)
>> |{
>> | rq->q->mq_ops->complete(rq);
>> |}
>>
>> so that completions issued from process context (like those from
>> usb-storage) don't end up waking `ksoftirqd' (running at SCHED_OTHER)
>> to complete the requests, but rather perform the completion right away.
>> The softirq dance makes no sense here.
>
> Agreed. But I don't think your above blk_mq_complete_request_local
> is all that useful either as ->complete is defined by the caller,
> so we could just do a direct call. Basically we should just
> return false from blk_mq_complete_request_remote after updating
> the state when called from process context.
Agreed.
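FWIW, from the driver's point of view that would just be the pattern
blk_mq_complete_request() already uses today (sketch from my reading of
the current code, not a new API):
|	if (!blk_mq_complete_request_remote(rq))
|		rq->q->mq_ops->complete(rq);
i.e. a process context caller simply gets false back and falls through
to ->complete() directly.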
> But given that IIRC
> we are not supposed to check what context we are called from,
> we'll need a helper just for updating the request state instead and
> ensure the driver uses the right helper. Now of course we might
> have process context callers that still want to bounce to the
> submitting CPU, but in that case we should go directly to a
> workqueue or similar.
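For the state-only helper I assume you mean something along these lines
(the name and placement here are made up, purely for illustration):
|	/* hypothetical helper, illustration only */
|	static inline void blk_mq_set_request_complete(struct request *rq)
|	{
|		WRITE_ONCE(rq->state, MQ_RQ_COMPLETE);
|	}
with the driver calling it before bouncing the completion itself.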
Going directly to a workqueue would mean that it may be suboptimal for
nvme-tcp, which completes requests in softirq context from the network
context (cpu determined by NIC steering), because that would trigger a
workqueue schedule on a per-request basis rather than once per
.data_ready call like we do today. Is that correct?
It has been observed that completing commands in softirq context
(on the network-determined cpu) works well, because basically the
completion does IPI + local complete, not IPI + softirq or IPI +
workqueue.
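To spell out what I mean by IPI + local complete, this is roughly how I
read the llist-based remote path in this series (simplified sketch, csd
setup and list ordering omitted, so take the details with a grain of
salt):
|	/* completing cpu: queue the request, kick the target cpu once */
|	cpu = rq->mq_ctx->cpu;
|	if (llist_add(&rq->ipi_list, &per_cpu(blk_cpu_done, cpu)))
|		smp_call_function_single_async(cpu, &rq->csd);
|
|	/* IPI handler on the submitting cpu: complete everything queued */
|	llist_for_each_entry_safe(rq, next,
|				  llist_del_all(this_cpu_ptr(&blk_cpu_done)),
|				  ipi_list)
|		rq->q->mq_ops->complete(rq);
So the network cpu pays for an llist_add and at most one IPI per batch,
and the actual ->complete() runs on the submitting cpu without any
softirq or workqueue hop.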
> Either way doing this properly will probably involve an audit of all
> drivers, but I think that is worth it.
Agree.