Message-ID: <20180528133503.awomzj6djozbo5bv@quack2.suse.cz>
Date: Mon, 28 May 2018 15:35:03 +0200
From: Jan Kara <jack@...e.cz>
To: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc: syzbot <syzbot+4a7438e774b21ddd8eca@...kaller.appspotmail.com>,
syzkaller-bugs@...glegroups.com, jack@...e.cz,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
viro@...iv.linux.org.uk, axboe@...nel.dk, tj@...nel.org,
david@...morbit.com, linux-block@...r.kernel.org
Subject: Re: general protection fault in wb_workfn (2)
On Sun 27-05-18 09:47:54, Tetsuo Handa wrote:
> Forwarding http://lkml.kernel.org/r/201805251915.FGH64517.HVFJOOLFFMQStO@I-love.SAKURA.ne.jp .
>
> Jan Kara wrote:
> > > void delayed_work_timer_fn(struct timer_list *t)
> > > {
> > > struct delayed_work *dwork = from_timer(dwork, t, timer);
> > >
> > > /* should have been called from irqsafe timer with irq already off */
> > > __queue_work(dwork->cpu, dwork->wq, &dwork->work);
> > > }
> > >
> > > Then, wb_workfn() is after all scheduled even if we check for
> > > WB_registered bit, isn't it?
> >
> > It can be queued after WB_registered bit is cleared but it cannot be queued
> > after mod_delayed_work(bdi_wq, &wb->dwork, 0) has finished. That function
> > deletes the pending timer (the timer cannot be armed again because
> > WB_registered is cleared) and queues what should be the last round of
> > wb_workfn().
>
> mod_delayed_work() deletes the pending timer but does not wait for already
> invoked timer handler to complete because it is using del_timer() rather than
> del_timer_sync(). Then, what happens if __queue_work() is almost concurrently
> executed from two CPUs, one from mod_delayed_work(bdi_wq, &wb->dwork, 0) from
> wb_shutdown() path (which is called without spin_lock_bh(&wb->work_lock)) and
> the other from delayed_work_timer_fn() path (which is called without checking
> WB_registered bit under spin_lock_bh(&wb->work_lock)) ?

In this case, the work should still be queued only once. The synchronization
is provided by WORK_STRUCT_PENDING_BIT: when a delayed work is queued by
mod_delayed_work(), this bit is set, and it is cleared only once the work
starts executing on some CPU. Admittedly this code is rather convoluted, so
I may be missing something.
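
To illustrate the pending-bit idea only, here is a minimal user-space sketch
using C11 atomics. It is not the kernel code: struct model_work and the
model_*() helpers are hypothetical names, and the sketch covers just the
"whoever sets the bit first gets to queue" rule, not the way
mod_delayed_work() takes over an item that is already pending.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a work item; not kernel code. */
struct model_work {
    atomic_bool pending;  /* plays the role of WORK_STRUCT_PENDING_BIT */
    atomic_int queued;    /* how many times the item really got queued */
};

static void model_work_init(struct model_work *w)
{
    atomic_init(&w->pending, false);
    atomic_init(&w->queued, 0);
}

/* Queue the item only if this caller wins the pending bit. */
static bool model_queue_work(struct model_work *w)
{
    /* atomic_exchange() returns the previous value: true means the
     * item is already pending, so this caller must back off. */
    if (atomic_exchange(&w->pending, true))
        return false;
    atomic_fetch_add(&w->queued, 1);
    return true;
}

/* A worker picks the item up: the bit is cleared when execution starts. */
static void model_start_work(struct model_work *w)
{
    atomic_store(&w->pending, false);
}

/* Two queue attempts before the work starts: only the first succeeds. */
static int model_demo(void)
{
    struct model_work w;
    bool first, second;

    model_work_init(&w);
    first = model_queue_work(&w);   /* first path wins the bit */
    second = model_queue_work(&w);  /* second, racing path backs off */
    assert(first && !second);
    model_start_work(&w);           /* bit cleared, item may be queued again */
    return atomic_load(&w.queued);
}
```

Calling model_demo() from a main() returns 1: the item was queued once even
though two paths tried. The calls here are sequential for simplicity; the
atomic exchange is what would keep the same rule intact if the two attempts
ran on different CPUs.
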
Also note that flush_delayed_work(), which follows mod_delayed_work() in
wb_shutdown(), does del_timer_sync(), so I don't see how anything could get
past that. In fact, mod_delayed_work() is in the wb_shutdown() path to make
sure wb_workfn() gets executed at least once before the bdi_writeback
structure gets cleaned up, so that all queued items are finished. We do not
rely on it to remove pending timers or queued wb_workfn() executions.
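
The intended ordering can be written down as a deliberately simplified,
single-threaded model. Everything named model_* below is hypothetical, and
locking, timers and the real workqueue are left out; the sketch only shows
"stop accepting items, force one last round, and only then clean up".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of a writeback context; locking,
 * timers and the real workqueue are deliberately left out. */
struct model_wb {
    bool registered;  /* stands in for the WB_registered bit */
    int nr_queued;    /* items waiting for the work function */
    int nr_done;      /* items the work function has finished */
};

/* New items are accepted only while the context is registered. */
static bool model_wb_queue_item(struct model_wb *wb)
{
    if (!wb->registered)
        return false;
    wb->nr_queued++;
    return true;
}

/* Stands in for wb_workfn(): finish everything that is queued. */
static void model_wb_workfn(struct model_wb *wb)
{
    wb->nr_done += wb->nr_queued;
    wb->nr_queued = 0;
}

/* Shutdown: stop new items, then force one final round and wait for it. */
static void model_wb_shutdown(struct model_wb *wb)
{
    wb->registered = false;  /* nothing can be queued from here on */
    model_wb_workfn(wb);     /* models mod_delayed_work(..., 0) followed
                              * by flush_delayed_work() */
    assert(wb->nr_queued == 0);  /* safe to clean up the structure now */
}

/* Queue two items, shut down, then try to queue a late one. */
static int model_wb_demo(void)
{
    struct model_wb wb = { .registered = true };
    bool late;

    model_wb_queue_item(&wb);
    model_wb_queue_item(&wb);
    model_wb_shutdown(&wb);
    late = model_wb_queue_item(&wb);  /* rejected: no longer registered */
    assert(!late);
    return wb.nr_done;
}
```

In this model, model_wb_demo() returns 2: both items queued before shutdown
are finished by the final round, and the late one is refused.
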
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR