[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAD=FV=Vsk0SjkA+DbUwJxvO6NFcr0CO9=H1FD7okJ2PxMt5pYA@mail.gmail.com>
Date: Fri, 3 Apr 2020 08:49:54 -0700
From: Doug Anderson <dianders@...omium.org>
To: Ming Lei <ming.lei@...hat.com>
Cc: Jens Axboe <axboe@...nel.dk>,
"James E.J. Bottomley" <jejb@...ux.ibm.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Paolo Valente <paolo.valente@...aro.org>,
Salman Qazi <sqazi@...gle.com>,
linux-block <linux-block@...r.kernel.org>,
linux-scsi@...r.kernel.org, Guenter Roeck <groeck@...omium.org>,
Ajay Joshi <ajay.joshi@....com>, Arnd Bergmann <arnd@...db.de>,
Bart Van Assche <bvanassche@....org>,
Chaitanya Kulkarni <chaitanya.kulkarni@....com>,
Damien Le Moal <damien.lemoal@....com>,
Hou Tao <houtao1@...wei.com>,
Pavel Begunkov <asml.silence@...il.com>,
Tejun Heo <tj@...nel.org>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 2/2] blk-mq: Rerun dispatching in the case of budget contention
Hi,
On Fri, Apr 3, 2020 at 8:10 AM Doug Anderson <dianders@...omium.org> wrote:
>
> Correct that it only happens with BFQ, but whether it's a BFQ bug or
> not just depends on how you define the has_work() API. If has_work()
> is allowed to be in-exact then it's either a blk-mq bug or a SCSI bug
> depending on how you cut it. If has_work() must be exact then it is
> certainly a BFQ bug. If has_work() doesn't need to be exact then it's
> not a BFQ bug. I believe that a sane API could be defined either way.
> Either has_work() can be defined as a lightweight hint to trigger
> heavier code or it can be defined as something exact. It's really up
> to blk-mq to say how they define it.
>
> From his response on the SCSI patch [1], it sounded like Jens was OK
> with has_work() being a lightweight hint as long as BFQ ensures that
> the queues run later. ...but, as my investigation found, I believe
> that BFQ _does_ try to ensure that the queue is run at a later time by
> calling blk_mq_run_hw_queues(). The irony is that due to the race
> we're talking about here, blk_mq_run_hw_queues() isn't guaranteed to
> be reliable if has_work() is inexact. :( One way to address this is
> to make blk_mq_run_hw_queues() reliable even if has_work() is inexact.
>
> ...so Jens: care to clarify how you'd like has_work() to be defined?
Sorry to reply so quickly after my prior reply, but I might have found
an extreme corner case where we can still run into the same race even
if has_work() is exact. This is all theoretical from code analysis.
Maybe you can poke a hole in my scenario or tell me it's so
implausible that we don't care, but it seems like it's theoretically
possible. For this example I'll assume a budget of 1 (AKA only one
thread can get budget for a given queue):
* Threads A and B both run has_work() at the same time with the same
"hctx". has_work() is exact but there's no lock, so it's OK if
Thread A and B both get back true.
* Thread B gets interrupted for a long time right after it decides
that there is work. Maybe its CPU gets an interrupt and the
interrupt handler is slow.
* Thread A runs, get budget, dispatches work.
* Thread A's work finishes and budget is released.
* Thread B finally runs again and gets budget.
* Since Thread A already took care of the work and no new work has
come in, Thread B will get NULL from dispatch_request(). I believe
this is specifically why dispatch_request() is allowed to return
NULL in the first place if has_work() must be exact.
* Thread B will now be holding the budget and is about to call
put_budget(), but hasn't called it yet.
* Thread B gets interrupted for a long time (again). Dang interrupts.
* Now Thread C (with a different "hctx" but the same queue) comes
along and runs blk_mq_do_dispatch_sched().
* Thread C won't do anything because it can't get budget.
* Finally Thread B will run again and put the budget without kicking
any queues.
Now we have a potential I/O stall because nobody will ever kick the
queues.
I think the above example could happen even on non-BFQ systems and I
think it would also be fixed by an approach like the one in my patch.
-Doug
Powered by blists - more mailing lists