Message-ID: <20170829155205.GA17290@redhat.com>
Date: Tue, 29 Aug 2017 17:52:05 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Byungchul Park <byungchul.park@....com>, mingo@...nel.org,
linux-kernel@...r.kernel.org, kernel-team@....com,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Dave Chinner <david@...morbit.com>, Tejun Heo <tj@...nel.org>,
johannes@...solutions.net
Subject: Re: [PATCH v3 1/3] lockdep: Make LOCKDEP_CROSSRELEASE configs all
part of PROVE_LOCKING
Peter, sorry for the delay, I didn't have a chance to return to this discussion...
On 08/23, Peter Zijlstra wrote:
>
> > > It was added by Oleg in commit:
> > >
> > > a67da70dc095 ("workqueues: lockdep annotations for flush_work()")
> >
> > No, these annotations were moved later into start_flush, iiuc...
> >
> > This
> >
> > lock_map_acquire(&work->lockdep_map);
> > lock_map_release(&work->lockdep_map);
> >
> > was added by another commit 0976dfc1d0cd80a4e9dfaf87bd8744612bde475a
> > "workqueue: Catch more locking problems with flush_work()", and at
> > first glance it is fine.
>
> Those are fine and are indeed the flush_work() vs work inversion.
>
> The two straight forward annotations are:
>
>	flush_work(work)	process_one_work(wq, work)
>	  A(work)		  A(work)
>	  R(work)		  work->func(work);
>				  R(work)
>
> Which catches:
>
>	Task-1:			work:
>
>	  mutex_lock(&A);	  mutex_lock(&A);
>	  flush_work(work);
Yes, yes, this is clear.

But if we ignore the multithreaded workqueues, in this particular case
we could rely on A(wq)/R(wq) in start_flush_work() and process_one_work().

The problem is that start_flush_work() does not do acquire/release
unconditionally, it does this only if it is going to wait, and I am not
sure this is right...

Plus process_one_work() does lock_map_acquire_read(), I don't really
understand this either.
> And the analogous:
>
>	flush_workqueue(wq)	process_one_work(wq, work)
>	  A(wq)			  A(wq)
>	  R(wq)			  work->func(work);
>				  R(wq)
>
>
> The thing I puzzled over was flush_work() (really start_flush_work())
> doing:
>
>	if (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer)
>		lock_map_acquire(&pwq->wq->lockdep_map);
>	else
>		lock_map_acquire_read(&pwq->wq->lockdep_map);
>	lock_map_release(&pwq->wq->lockdep_map);
>
> Why does flush_work() care about the wq->lockdep_map?
>
> The answer is because, for single-threaded workqueues, doing
> flush_work() from a work is a potential deadlock:
Yes, but the simple answer is that flush_work() doesn't really differ
from flush_workqueue() in this respect?

If nothing else, if some WORK is the last queued work on WQ, then
flush_work(WORK) is the same thing as flush_workqueue(WQ), more or less.
Again, I am talking about single-threaded workqueues.
> workqueue-thread:
>
>	work-n:
>	  flush_work(work-n+1);
>
>	work-n+1:
>
>
> Will not be going anywhere fast..
Or another example,

	lock(LOCK);
	flush_work(WORK);
	unlock(LOCK);

	workqueue-thread:
		another_pending_work:
			LOCK(LOCK);
			UNLOCK(LOCK);

		WORK:
In this case we do not care about WORK->lockdep_map, but taking the
wq->lockdep_map from flush_work() (if single-threaded) allows lockdep
to report the deadlock.

Again, this is just like flush_workqueue().
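
To make the example above concrete, here is a toy userspace model of the
dependency tracking. It is only a sketch and not the kernel's lockdep:
the class names (LOCK, WQ_MAP, WORK_MAP, OTHER_WORK_MAP) and all toy_*
helpers are made up for illustration. It records "B taken while A held"
edges and counts a report whenever a new edge would close a cycle.

```c
#include <assert.h>
#include <string.h>

/* Toy lock classes; illustrative names, not kernel symbols. */
enum { LOCK, WQ_MAP, WORK_MAP, OTHER_WORK_MAP, NR_CLASSES };

static int edge[NR_CLASSES][NR_CLASSES]; /* edge[a][b]: b taken while a held */
static int held[NR_CLASSES];             /* stack of currently held classes */
static int nr_held;
static int nr_reports;

/* Depth-first search: is 'to' reachable from 'from' via recorded edges? */
static int toy_reaches(int from, int to, int *seen)
{
	if (from == to)
		return 1;
	seen[from] = 1;
	for (int next = 0; next < NR_CLASSES; next++)
		if (edge[from][next] && !seen[next] &&
		    toy_reaches(next, to, seen))
			return 1;
	return 0;
}

static void toy_acquire(int c)
{
	for (int i = 0; i < nr_held; i++) {
		int seen[NR_CLASSES] = { 0 };

		/* held[i] -> c closes a cycle if c already reaches held[i] */
		if (toy_reaches(c, held[i], seen))
			nr_reports++;
		edge[held[i]][c] = 1;
	}
	held[nr_held++] = c;
}

static void toy_release(int c)
{
	/* This sketch only supports strictly nested acquire/release. */
	assert(nr_held > 0 && held[nr_held - 1] == c);
	nr_held--;
}

/* Worker side: the single-threaded wq runs another_pending_work. */
static void toy_run_other_work(void)
{
	toy_acquire(WQ_MAP);
	toy_acquire(OTHER_WORK_MAP);
	toy_acquire(LOCK);		/* the work function takes LOCK */
	toy_release(LOCK);
	toy_release(OTHER_WORK_MAP);
	toy_release(WQ_MAP);
}

/* Flusher side: lock(LOCK); flush_work(WORK); unlock(LOCK); */
static void toy_flush_under_lock(int annotate_wq)
{
	toy_acquire(LOCK);
	toy_acquire(WORK_MAP);		/* A(work) / R(work) */
	toy_release(WORK_MAP);
	if (annotate_wq) {		/* A(wq) / R(wq) */
		toy_acquire(WQ_MAP);
		toy_release(WQ_MAP);
	}
	toy_release(LOCK);
}

/* Returns the number of cycles reported for the scenario above. */
int toy_scenario(int annotate_wq)
{
	memset(edge, 0, sizeof(edge));
	nr_held = 0;
	nr_reports = 0;
	toy_run_other_work();
	toy_flush_under_lock(annotate_wq);
	return nr_reports;
}
```

With annotate_wq set, the flusher adds LOCK -> WQ_MAP while the worker
has already recorded WQ_MAP -> LOCK, so the model reports the cycle.
Without it, only LOCK -> WORK_MAP is recorded, and since WORK itself
takes no locks, nothing is reported.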
Oleg.