Message-ID: <20201116153729.GC29991@casper.infradead.org>
Date: Mon, 16 Nov 2020 15:37:29 +0000
From: Matthew Wilcox <willy@...radead.org>
To: Byungchul Park <byungchul.park@....com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...nel.org>, torvalds@...ux-foundation.org,
peterz@...radead.org, mingo@...hat.com, will@...nel.org,
linux-kernel@...r.kernel.org, joel@...lfernandes.org,
alexander.levin@...rosoft.com, daniel.vetter@...ll.ch,
chris@...is-wilson.co.uk, duyuyang@...il.com,
johannes.berg@...el.com, tj@...nel.org, tytso@....edu,
david@...morbit.com, amir73il@...il.com, bfields@...ldses.org,
gregkh@...uxfoundation.org, kernel-team@....com
Subject: Re: [RFC] Are you good with Lockdep?

On Mon, Nov 16, 2020 at 05:57:57PM +0900, Byungchul Park wrote:
> On Thu, Nov 12, 2020 at 02:52:51PM +0000, Matthew Wilcox wrote:
> > On Thu, Nov 12, 2020 at 09:26:12AM -0500, Steven Rostedt wrote:
> > > > FYI, roughly Lockdep is doing:
> > > >
> > > > 1. Dependency check
> > > > 2. Lock usage correctness check (including RCU)
> > > > 3. IRQ related usage correctness check with IRQFLAGS
> > > >
> > > > 2 and 3 should be there forever which is subtle and have gotten matured.
> > > > But 1 is not. I've been talking about 1. But again, it's not about
> > > > replacing it right away but having both for a while. I'm gonna try my
> > > > best to make it better.
> > >
> > > And I believe lockdep does handle 1. Perhaps show some tangible use case
> > > that you want to cover that you do not believe that lockdep can handle. If
> > > lockdep cannot handle it, it will show us where lockdep is lacking. If it
> > > can handle it, it will educate you on other ways that lockdep can be
> > > helpful in your development ;-)
> >
> > Something I believe lockdep is missing is a way to annotate "This lock
> > will be released by a softirq". If we had lockdep for lock_page(), this
> > would be a great case to show off. The filesystem locks the page, then
> > submits it to a device driver. On completion, the filesystem's bio
> > completion handler will be called in softirq context and unlock the page.
> >
> > So if the filesystem has another lock which is acquired by the completion
> > handler, we could get an ABBA deadlock that lockdep would be unable to see.
> >
> > There are other similar things; if you look at the remaining semaphore
> > users in the kernel, you'll see the general pattern is that they're
> > acquired in process context and then released in interrupt context.
> > If we had a way to transfer ownership of the semaphore to a generic
> > "interrupt context", they could become mutexes and lockdep could check
> > that nothing else will cause a deadlock.
>
> Yes. Those are exactly what the Cross-release feature solves. Those
> problems can be solved with Cross-release. But even with Cross-release,
> we still cannot solve the problems of (1) readlock handling and (2) false
> positives preventing further reporting.

It's not just about lockdep for semaphores. Mutexes will spin if the
current owner is still running, so to convert an interrupt-released
semaphore to a mutex, we need a way to mark the mutex as being released
by the new owner.
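
To make the ownership point concrete, here is a minimal userspace sketch.
It is an analogy only, not kernel code: POSIX threads stand in for process
and softirq context, and page_sem, page_mtx, sem_handoff() and
mutex_handoff() are hypothetical names made up for illustration. A
semaphore has no owner, so a second thread can release it; an
error-checking pthread mutex tracks its owner and is expected to refuse
the same handoff.

```c
/* Userspace analogy only: pthreads stand in for process/softirq context. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t page_sem;            /* hypothetical stand-in for a page lock */
static pthread_mutex_t page_mtx;  /* owner-tracking (error-checking) mutex */

/* "Completion handler": runs in a different thread than the locker. */
static void *complete_sem(void *arg)
{
	(void)arg;
	sem_post(&page_sem);      /* legal: a semaphore has no owner */
	return NULL;
}

/* Same handoff attempted on a mutex; records what the unlock returned. */
static void *complete_mtx(void *arg)
{
	*(int *)arg = pthread_mutex_unlock(&page_mtx);
	return NULL;
}

/* Returns 0 if the other thread managed to release the semaphore. */
int sem_handoff(void)
{
	pthread_t t;
	int rc;

	sem_init(&page_sem, 0, 1);
	sem_wait(&page_sem);                  /* acquire in "process context" */
	rc = pthread_create(&t, NULL, complete_sem, NULL);
	assert(rc == 0);
	pthread_join(t, NULL);
	rc = sem_trywait(&page_sem);          /* succeeds only if it was released */
	sem_destroy(&page_sem);
	return rc;
}

/* Returns the error code the non-owner got from its unlock attempt. */
int mutex_handoff(void)
{
	pthread_mutexattr_t attr;
	pthread_t t;
	int rc, unlock_rc = 0;

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&page_mtx, &attr);
	pthread_mutexattr_destroy(&attr);

	pthread_mutex_lock(&page_mtx);        /* this thread is now the owner */
	rc = pthread_create(&t, NULL, complete_mtx, &unlock_rc);
	assert(rc == 0);
	pthread_join(t, NULL);
	pthread_mutex_unlock(&page_mtx);      /* the real owner cleans up */
	pthread_mutex_destroy(&page_mtx);
	return unlock_rc;
}
```

On a conforming POSIX system (built with -pthread), sem_handoff() should
return 0 and mutex_handoff() should return EPERM. The kernel's struct mutex
has the same owner assumption, plus the optimistic spinning mentioned above,
which is why a plain conversion is not enough.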

I really don't think you want to report subsequent lockdep splats.