Message-ID: <20251013052354.GA75512@system.software.com>
Date: Mon, 13 Oct 2025 14:23:54 +0900
From: Byungchul Park <byungchul@...com>
To: NeilBrown <neil@...wn.name>
Cc: linux-kernel@...r.kernel.org, kernel_team@...ynix.com,
torvalds@...ux-foundation.org, damien.lemoal@...nsource.wdc.com,
linux-ide@...r.kernel.org, adilger.kernel@...ger.ca,
linux-ext4@...r.kernel.org, mingo@...hat.com, peterz@...radead.org,
will@...nel.org, tglx@...utronix.de, rostedt@...dmis.org,
joel@...lfernandes.org, sashal@...nel.org, daniel.vetter@...ll.ch,
duyuyang@...il.com, johannes.berg@...el.com, tj@...nel.org,
tytso@....edu, willy@...radead.org, david@...morbit.com,
amir73il@...il.com, gregkh@...uxfoundation.org, kernel-team@....com,
linux-mm@...ck.org, akpm@...ux-foundation.org, mhocko@...nel.org,
minchan@...nel.org, hannes@...xchg.org, vdavydov.dev@...il.com,
sj@...nel.org, jglisse@...hat.com, dennis@...nel.org, cl@...ux.com,
penberg@...nel.org, rientjes@...gle.com, vbabka@...e.cz,
ngupta@...are.org, linux-block@...r.kernel.org,
josef@...icpanda.com, linux-fsdevel@...r.kernel.org, jack@...e.cz,
jlayton@...nel.org, dan.j.williams@...el.com, hch@...radead.org,
djwong@...nel.org, dri-devel@...ts.freedesktop.org,
rodrigosiqueiramelo@...il.com, melissa.srw@...il.com,
hamohammed.sa@...il.com, harry.yoo@...cle.com,
chris.p.wilson@...el.com, gwan-gyeong.mun@...el.com,
max.byungchul.park@...il.com, boqun.feng@...il.com,
longman@...hat.com, yunseong.kim@...csson.com, ysk@...lloc.com,
yeoreum.yun@....com, netdev@...r.kernel.org,
matthew.brost@...el.com, her0gyugyu@...il.com, corbet@....net,
catalin.marinas@....com, bp@...en8.de, dave.hansen@...ux.intel.com,
x86@...nel.org, hpa@...or.com, luto@...nel.org,
sumit.semwal@...aro.org, gustavo@...ovan.org,
christian.koenig@....com, andi.shyti@...nel.org, arnd@...db.de,
lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com,
rppt@...nel.org, surenb@...gle.com, mcgrof@...nel.org,
petr.pavlu@...e.com, da.gomez@...nel.org, samitolvanen@...gle.com,
paulmck@...nel.org, frederic@...nel.org, neeraj.upadhyay@...nel.org,
joelagnelf@...dia.com, josh@...htriplett.org, urezki@...il.com,
mathieu.desnoyers@...icios.com, jiangshanlai@...il.com,
qiang.zhang@...ux.dev, juri.lelli@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com,
bsegall@...gle.com, mgorman@...e.de, vschneid@...hat.com,
chuck.lever@...cle.com, okorniev@...hat.com, Dai.Ngo@...cle.com,
tom@...pey.com, trondmy@...nel.org, anna@...nel.org,
kees@...nel.org, bigeasy@...utronix.de, clrkwllms@...nel.org,
mark.rutland@....com, ada.coupriediaz@....com,
kristina.martsenko@....com, wangkefeng.wang@...wei.com,
broonie@...nel.org, kevin.brodsky@....com, dwmw@...zon.co.uk,
shakeel.butt@...ux.dev, ast@...nel.org, ziy@...dia.com,
yuzhao@...gle.com, baolin.wang@...ux.alibaba.com,
usamaarif642@...il.com, joel.granados@...nel.org,
richard.weiyang@...il.com, geert+renesas@...der.be,
tim.c.chen@...ux.intel.com, linux@...blig.org,
alexander.shishkin@...ux.intel.com, lillian@...r-ark.net,
chenhuacai@...nel.org, francesco@...la.it,
guoweikang.kernel@...il.com, link@...o.com, jpoimboe@...nel.org,
masahiroy@...nel.org, brauner@...nel.org,
thomas.weissschuh@...utronix.de, oleg@...hat.com, mjguzik@...il.com,
andrii@...nel.org, wangfushuai@...du.com, linux-doc@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-media@...r.kernel.org,
linaro-mm-sig@...ts.linaro.org, linux-i2c@...r.kernel.org,
linux-arch@...r.kernel.org, linux-modules@...r.kernel.org,
rcu@...r.kernel.org, linux-nfs@...r.kernel.org,
linux-rt-devel@...ts.linux.dev
Subject: Re: [PATCH v17 28/47] dept: add documentation for dept
On Fri, Oct 03, 2025 at 04:55:14PM +1000, NeilBrown wrote:
> On Thu, 02 Oct 2025, Byungchul Park wrote:
> > This document describes the concept and APIs of dept.
> >
>
> Thanks for the documentation. I've been trying to understand it.
You're welcome. Feel free to ask me if you have any questions.
> > +How DEPT works
> > +--------------
> > +
> > +Let's take a look at how DEPT works with the 1st example in the section
> > +'Limitation of lockdep'.
> > +
> > +   context X       context Y       context Z
> > +
> > +                   mutex_lock A
> > +   folio_lock B
> > +                   folio_lock B <- DEADLOCK
> > +                                   mutex_lock A <- DEADLOCK
> > +                                   folio_unlock B
> > +                   folio_unlock B
> > +                   mutex_unlock A
> > +                                   mutex_unlock A
> > +
> > +Adding comments to describe DEPT's view in terms of wait and event:
> > +
> > +   context X       context Y       context Z
> > +
> > +                   mutex_lock A
> > +                   /* wait for A */
> > +                   /* start event A context */
> > +
> > +   folio_lock B
> > +   /* wait for B */
> > +   /* start event B context */
> > +
> > +                   folio_lock B
> > +                   /* wait for B */ <- DEADLOCK
> > +                   /* start event B context */
> > +
> > +                                   mutex_lock A
> > +                                   /* wait for A */ <- DEADLOCK
> > +                                   /* start event A context */
> > +
> > +                                   folio_unlock B
> > +                                   /* event B */
> > +                   folio_unlock B
> > +                   /* event B */
> > +
> > +                   mutex_unlock A
> > +                   /* event A */
> > +                                   mutex_unlock A
> > +                                   /* event A */
> > +
>
> I can't see the value of the above section.
> The first section with no comments is useful as it is easy to see the
> deadlock being investigated. The section below is useful as it adds
> comments to explain how DEPT sees the situation. But the above section,
> with some but not all of the comments, does not seem (to me) to add
> anything useful.
I just wanted to convert 'locking terms' into 'wait and event terms' one
step at a time. However, I can remove the section you pointed out as not
useful.
> > +Adding more supplementary comments to describe DEPT's view in detail:
> > +
> > +   context X       context Y       context Z
> > +
> > +                   mutex_lock A
> > +                   /* might wait for A */
> > +                   /* start to take into account event A's context */
>
> What do you mean precisely by "context"?
It means any one of task context, irq context, wq worker context (even
though that can also be considered task context), and so on.
Of course, in the example above, it must be task context since the
example involves only sleepable locks. However, I wanted to use a general
term in the document to cover all types of context, e.g. irq, task, and
so on.
> You use the word in the heading "context X context Y context Z"
> so it seems like "context" means "task" or "process". But then as I
> read on, I think - maybe it means something else. If it does, then you
> should use different words. Maybe "task X ..." in the heading.
It should cover all the types of context. What word would you recommend
for that purpose?
> In the examples that follow, it seems that the "context" for event A
> starts at "mutex lock A" when it (possibly) waits for a mutex and ends
> at "mutex unlock A" - which are both in the same process. Clearly
> various other events that happen between these two points in the same
> process could be seen as the "context" for event A.
>
> However event B starts in "context X" with "folio_lock B" and ends in
> "context Z" or "context Y" with "folio_unlock B". Is that right?
Right.
> My question then is: how do you decide which, of all the events in all
> the processes in the whole system, between the start[S] and the end[E],
> are considered to be part of the "context" of event A.
DEPT can identify the "context" of event A only *once* event A is
actually executed. At that point, it builds dependencies between the
event and the waits recorded in the "context" of event A since [S].
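To make the rule above a little more concrete, here is a rough model in
Python. It is only a sketch for discussion, not how DEPT is implemented in
the kernel: the Tracker class, its wait() and event() methods, and the
global stamp counter are all hypothetical names made up for this
illustration. Each context logs its waits with a stamp, and when an event
is executed, the waits logged in that same context after the event's
start point [S] become dependencies of the event.

```python
from collections import defaultdict
from itertools import count


class Tracker:
    """Toy model of 'record waits, resolve them when the event runs'.

    Hypothetical names throughout; this is not DEPT's real data layout.
    """

    def __init__(self):
        self._clock = count(1)            # global, monotonically increasing stamp
        self.waits = defaultdict(list)    # context -> [(stamp, waited class)]
        self.deps = set()                 # (event class, wait class) pairs

    def wait(self, ctx, cls):
        """Log that 'ctx' might wait for 'cls'; return the stamp taken."""
        stamp = next(self._clock)
        self.waits[ctx].append((stamp, cls))
        return stamp

    def event(self, ctx, cls, since):
        """'ctx' executes event 'cls', which has been valid since 'since'.

        Every wait logged in 'ctx' strictly after 'since' sits between the
        start point and the event, so the event depends on it.
        """
        for stamp, waited in self.waits[ctx]:
            if stamp > since:
                self.deps.add((cls, waited))


# Replay the trace from the documentation (stamps 1..4 match its comments).
t = Tracker()
s1 = t.wait("Y", "A")        # mutex_lock A in context Y   -> /* 1 */
s2 = t.wait("X", "B")        # folio_lock B in context X   -> /* 2 */
s3 = t.wait("Y", "B")        # folio_lock B in context Y   -> /* 3 */
s4 = t.wait("Z", "A")        # mutex_lock A in context Z   -> /* 4 */
t.event("Z", "B", since=s2)  # folio_unlock B in Z, valid since 2
t.event("Y", "B", since=s3)  # folio_unlock B in Y, valid since 3
t.event("Y", "A", since=s1)  # mutex_unlock A in Y, valid since 1
t.event("Z", "A", since=s4)  # mutex_unlock A in Z, valid since 4

print(sorted(t.deps))
```

In this model the replay ends with exactly two dependencies, 'A -> B'
(from context Y) and 'B -> A' (from context Z). Context X contributes
nothing because it never executes an event.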
> I think it would help me if you defined what a "context" is earlier.
Sorry if my description was not clear enough.
Byungchul
> What sorts of things appear in a context?
>
> Thanks,
> NeilBrown
>
>
> > +                   /* 1 */
> > +   folio_lock B
> > +   /* might wait for B */
> > +   /* start to take into account event B's context */
> > +   /* 2 */
> > +
> > +                   folio_lock B
> > +                   /* might wait for B */ <- DEADLOCK
> > +                   /* start to take into account event B's context */
> > +                   /* 3 */
> > +
> > +                                   mutex_lock A
> > +                                   /* might wait for A */ <- DEADLOCK
> > +                                   /* start to take into account
> > +                                      event A's context */
> > +                                   /* 4 */
> > +
> > +                                   folio_unlock B
> > +                                   /* event B that's been valid since 2 */
> > +                   folio_unlock B
> > +                   /* event B that's been valid since 3 */
> > +
> > +                   mutex_unlock A
> > +                   /* event A that's been valid since 1 */
> > +
> > +                                   mutex_unlock A
> > +                                   /* event A that's been valid since 4 */
> > +
> > +Let's build up the dependency graph with this example. Firstly, context X:
> > +
> > + context X
> > +
> > + folio_lock B
> > + /* might wait for B */
> > + /* start to take into account event B's context */
> > + /* 2 */
> > +
> > +There are no events to create a dependency. Next, context Y:
> > +
> > + context Y
> > +
> > + mutex_lock A
> > + /* might wait for A */
> > + /* start to take into account event A's context */
> > + /* 1 */
> > +
> > + folio_lock B
> > + /* might wait for B */
> > + /* start to take into account event B's context */
> > + /* 3 */
> > +
> > + folio_unlock B
> > + /* event B that's been valid since 3 */
> > +
> > + mutex_unlock A
> > + /* event A that's been valid since 1 */
> > +
> > +There are two events. For event B, folio_unlock B, since there are no
> > +waits between 3 and the event, event B does not create a dependency.
> > +For event A, there is a wait, folio_lock B, between 1 and the event,
> > +which means event A cannot be triggered if event B does not wake up the
> > +wait. Therefore, we can say event A depends on event B, say, 'A -> B'.
> > +After adding the dependency, the graph will look like:
> > +
> > + A -> B
> > +
> > + where 'A -> B' means that event A depends on event B.
> > +
> > +Lastly, context Z:
> > +
> > + context Z
> > +
> > + mutex_lock A
> > + /* might wait for A */
> > + /* start to take into account event A's context */
> > + /* 4 */
> > +
> > + folio_unlock B
> > + /* event B that's been valid since 2 */
> > +
> > + mutex_unlock A
> > + /* event A that's been valid since 4 */
> > +
> > +There are also two events. For event B, folio_unlock B, there is a
> > +wait, mutex_lock A, between 2 and the event (recall that 2 is at the
> > +very start of the timeline, before the wait), which means event B
> > +cannot be triggered if event A does not wake up the wait. Therefore,
> > +we can say event B depends on event A, say, 'B -> A'. After adding the
> > +dependency, the graph will look like:
> > +
> > +    -> A -> B -
> > +   /           \
> > +   \           /
> > +    -----------
> > +
> > + where 'A -> B' means that event A depends on event B.
> > +
> > +A new loop has been created, so DEPT can report it as a deadlock. For
> > +event A, mutex_unlock A, since there are no waits between 4 and the
> > +event, event A does not create a dependency. That's it.
> > +
> > +CONCLUSION
> > +
> > +DEPT works well with any general synchronization mechanism by focusing
> > +on waits, events and their contexts.
> > +
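The loop check in the walkthrough quoted above can be sketched in a few
lines of Python. This is only an illustration under simplifying
assumptions, not DEPT's actual algorithm or code: the graph is a plain
dict of sets, and reaches() and add_dependency() are hypothetical helper
names. A new dependency 'event -> wait' closes a loop exactly when the
event is already reachable from the wait.

```python
def reaches(graph, start, target):
    """Return True if 'target' is reachable from 'start' along edges."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return False


def add_dependency(graph, event, wait):
    """Add 'event -> wait'; return True if the new edge closes a loop."""
    loop = reaches(graph, wait, event)   # is there already a path back?
    graph.setdefault(event, set()).add(wait)
    return loop


graph = {}
first = add_dependency(graph, "A", "B")    # from context Y: A depends on B
second = add_dependency(graph, "B", "A")   # from context Z: B depends on A

print(first, second)
```

Here the first call adds 'A -> B' without closing a loop, and the second
call, adding 'B -> A', reports the loop that the documentation treats as
the deadlock.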