Message-ID: <1513305570.893.7.camel@gmx.de>
Date: Fri, 15 Dec 2017 03:39:30 +0100
From: Mike Galbraith <efault@....de>
To: Peter Zijlstra <peterz@...radead.org>,
Bart Van Assche <Bart.VanAssche@....com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
"kernel-team@...com" <kernel-team@...com>,
"oleg@...hat.com" <oleg@...hat.com>, "hch@....de" <hch@....de>,
"axboe@...nel.dk" <axboe@...nel.dk>,
"jianchao.w.wang@...cle.com" <jianchao.w.wang@...cle.com>,
"osandov@...com" <osandov@...com>, "tj@...nel.org" <tj@...nel.org>
Subject: Re: [PATCH 2/6] blk-mq: replace timeout synchronization with a RCU
and generation based scheme
On Thu, 2017-12-14 at 22:54 +0100, Peter Zijlstra wrote:
> On Thu, Dec 14, 2017 at 09:42:48PM +0000, Bart Van Assche wrote:
>
> > Some time ago the block layer was changed to handle timeouts in thread context
> > instead of interrupt context. See also commit 287922eb0b18 ("block: defer
> > timeouts to a workqueue").
>
> That only makes it a little better:
>
> 	Task-A                                  Worker
>
> 	write_seqcount_begin()
> 	blk_mq_rw_update_state(rq, IN_FLIGHT)
> 	blk_add_timer(rq)
> 	<timer>
> 		schedule_work()
> 	</timer>
> 	<context-switch to worker>
> 	                                        read_seqcount_begin()
> 	                                                while(seq & 1)
> 	                                                        cpu_relax();
>
>
> Now normally this isn't fatal because Worker will simply spin its entire
> time slice away and we'll eventually schedule our Task-A back in, which
> will complete the seqcount and things will work.
>
> But if, for some reason, our Worker was to have RT priority higher than
> our Task-A we'd be up some creek without no paddles.
Most kthreads, including kworkers, are very frequently SCHED_FIFO here.
-Mike
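
For readers following the quoted interleaving, here is a minimal userspace
sketch of the reader-side spin. It is not kernel code and not taken from the
patch: toy_seqcount, toy_write_begin(), toy_write_end(),
toy_read_begin_bounded() and toy_replay() are hypothetical stand-ins built on
C11 atomics, and the spin is bounded (the max_spins parameter is an
assumption of this sketch) so that the demo terminates. The real
read_seqcount_begin() has no such bound, which is why a SCHED_FIFO worker
that outranks Task-A on the same CPU would never let the writer finish.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's seqcount_t. */
struct toy_seqcount {
	atomic_uint sequence;	/* odd while a writer is inside its section */
};

/* Writer enters its section: sequence becomes odd. */
static void toy_write_begin(struct toy_seqcount *s)
{
	atomic_fetch_add_explicit(&s->sequence, 1, memory_order_acq_rel);
}

/* Writer leaves its section: sequence becomes even again. */
static void toy_write_end(struct toy_seqcount *s)
{
	atomic_fetch_add_explicit(&s->sequence, 1, memory_order_release);
}

/*
 * Reader side, modelled on the "while (seq & 1) cpu_relax();" loop in the
 * quoted diagram, but giving up after max_spins iterations. Returns true
 * and stores the even sequence value on success; returns false if a writer
 * was still in progress when the spin budget ran out.
 */
static bool toy_read_begin_bounded(struct toy_seqcount *s,
				   unsigned long max_spins,
				   unsigned int *seq_out)
{
	for (unsigned long i = 0; i < max_spins; i++) {
		unsigned int seq = atomic_load_explicit(&s->sequence,
							memory_order_acquire);
		if (!(seq & 1)) {
			*seq_out = seq;
			return true;
		}
		/* the kernel would call cpu_relax() here */
	}
	return false;
}

/* Replays the quoted interleaving on a single thread. Returns 0 on success. */
static int toy_replay(void)
{
	struct toy_seqcount s;
	unsigned int seq = 0;

	atomic_init(&s.sequence, 0);

	toy_write_begin(&s);	/* Task-A: sequence is now odd */

	/* <context-switch to worker>: Task-A does not run while we spin */
	bool progressed = toy_read_begin_bounded(&s, 1000, &seq);
	assert(!progressed);	/* reader burned its whole budget */

	toy_write_end(&s);	/* Task-A is finally scheduled back in */

	progressed = toy_read_begin_bounded(&s, 1000, &seq);
	assert(progressed && seq == 2);

	return 0;
}
```

In the non-RT case the exhausted spin budget corresponds to the worker using
up its time slice, after which Task-A runs and completes the write. With the
worker at a higher RT priority there is no such hand-back, so the first
toy_read_begin_bounded() call models a spin that would never end.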