[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y8tfgYNZ//feEDvC@Boquns-Mac-mini.local>
Date: Fri, 20 Jan 2023 19:44:01 -0800
From: Boqun Feng <boqun.feng@...il.com>
To: Byungchul Park <byungchul.park@....com>
Cc: linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org,
damien.lemoal@...nsource.wdc.com, linux-ide@...r.kernel.org,
adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org,
mingo@...hat.com, peterz@...radead.org, will@...nel.org,
tglx@...utronix.de, rostedt@...dmis.org, joel@...lfernandes.org,
sashal@...nel.org, daniel.vetter@...ll.ch, duyuyang@...il.com,
johannes.berg@...el.com, tj@...nel.org, tytso@....edu,
willy@...radead.org, david@...morbit.com, amir73il@...il.com,
gregkh@...uxfoundation.org, kernel-team@....com,
linux-mm@...ck.org, akpm@...ux-foundation.org, mhocko@...nel.org,
minchan@...nel.org, hannes@...xchg.org, vdavydov.dev@...il.com,
sj@...nel.org, jglisse@...hat.com, dennis@...nel.org, cl@...ux.com,
penberg@...nel.org, rientjes@...gle.com, vbabka@...e.cz,
ngupta@...are.org, linux-block@...r.kernel.org,
paolo.valente@...aro.org, josef@...icpanda.com,
linux-fsdevel@...r.kernel.org, viro@...iv.linux.org.uk,
jack@...e.cz, jlayton@...nel.org, dan.j.williams@...el.com,
hch@...radead.org, djwong@...nel.org,
dri-devel@...ts.freedesktop.org, rodrigosiqueiramelo@...il.com,
melissa.srw@...il.com, hamohammed.sa@...il.com,
42.hyeyoo@...il.com, chris.p.wilson@...el.com,
gwan-gyeong.mun@...el.com, max.byungchul.park@...il.com,
longman@...hat.com
Subject: Re: [PATCH RFC v7 00/23] DEPT(Dependency Tracker)
On Sat, Jan 21, 2023 at 12:28:14PM +0900, Byungchul Park wrote:
> On Thu, Jan 19, 2023 at 07:07:59PM -0800, Boqun Feng wrote:
> > On Thu, Jan 19, 2023 at 06:23:49PM -0800, Boqun Feng wrote:
> > > On Fri, Jan 20, 2023 at 10:51:45AM +0900, Byungchul Park wrote:
>
> [...]
>
> > > > T0 T1 T2
> > > > -- -- --
> > > > unfair_read_lock(A);
> > > > write_lock(B);
> > > > write_lock(A);
> > > > write_lock(B);
> > > > fair_read_lock(A);
> > > > write_unlock(B);
> > > > read_unlock(A);
> > > > read_unlock(A);
> > > > write_unlock(B);
> > > > write_unlock(A);
> > > >
> > > > T0: read_unlock(A) cannot happen if write_lock(B) is stuck by a B owner
> > > > not doing either write_unlock(B) or read_unlock(B). In other words:
> > > >
> > > > 1. read_unlock(A) happening depends on write_unlock(B) happening.
> > > > 2. read_unlock(A) happening depends on read_unlock(B) happening.
> > > >
> > > > T1: write_unlock(B) cannot happen if fair_read_lock(A) is stuck by a A
> > > > owner not doing either write_unlock(A) or read_unlock(A). In other
> > > > words:
> > > >
> > > > 3. write_unlock(B) happening depends on write_unlock(A) happening.
> > > > 4. write_unlock(B) happening depends on read_unlock(A) happening.
> > > >
> > > > 1, 2, 3 and 4 give the following dependencies:
> > > >
> > > > 1. read_unlock(A) -> write_unlock(B)
> > > > 2. read_unlock(A) -> read_unlock(B)
> > > > 3. write_unlock(B) -> write_unlock(A)
> > > > 4. write_unlock(B) -> read_unlock(A)
> > > >
> > > > With 1 and 4, there's a circular dependency so DEPT definitely report
> > > > this as a problem.
> > > >
> > > > REMIND: DEPT focuses on waits and events.
> > >
> > > Do you have the test cases showing DEPT can detect this?
> > >
> >
> > Just tried the following on your latest GitHub branch, I commented all
> > but one deadlock case. Lockdep CAN detect it but DEPT CANNOT detect it.
> > Feel free to double check.
>
> I tried the 'queued read lock' test cases with DEPT on. I can see DEPT
> detect and report it. But yeah.. it's too verbose now. It's because DEPT
> is not aware of the test environment so it's just working hard to report
> every case.
>
> To make DEPT work with the selftest better, some works are needed. I
> will work on it later or you please work on it.
>
> The corresponding report is the following.
>
[...]
> [ 4.593037] context A's detail
> [ 4.593351] ---------------------------------------------------
> [ 4.593944] context A
> [ 4.594182] [S] lock(&rwlock_A:0)
> [ 4.594577] [W] lock(&rwlock_B:0)
> [ 4.594952] [E] unlock(&rwlock_A:0)
> [ 4.595341]
> [ 4.595501] [S] lock(&rwlock_A:0):
> [ 4.595848] [<ffffffff814eb244>] queued_read_lock_hardirq_ER_rE+0xf4/0x170
> [ 4.596547] stacktrace:
> [ 4.596797] _raw_read_lock+0xcf/0x110
> [ 4.597215] queued_read_lock_hardirq_ER_rE+0xf4/0x170
> [ 4.597766] dotest+0x30/0x7bc
> [ 4.598118] locking_selftest+0x2c6f/0x2ead
> [ 4.598602] start_kernel+0x5aa/0x6d5
> [ 4.599017] secondary_startup_64_no_verify+0xe0/0xeb
> [ 4.599562]
[...]
> [ 4.608427] [<ffffffff814eb3b4>] queued_read_lock_hardirq_RE_Er+0xf4/0x170
> [ 4.609113] stacktrace:
> [ 4.609366] _raw_write_lock+0xc3/0xd0
> [ 4.609788] queued_read_lock_hardirq_RE_Er+0xf4/0x170
> [ 4.610371] dotest+0x30/0x7bc
> [ 4.610730] locking_selftest+0x2c41/0x2ead
> [ 4.611195] start_kernel+0x5aa/0x6d5
> [ 4.611615] secondary_startup_64_no_verify+0xe0/0xeb
> [ 4.612164]
> [ 4.612325] [W] lock(&rwlock_A:0):
> [ 4.612671] [<ffffffff814eb3c0>] queued_read_lock_hardirq_RE_Er+0x100/0x170
> [ 4.613369] stacktrace:
> [ 4.613622] _raw_read_lock+0xac/0x110
> [ 4.614047] queued_read_lock_hardirq_RE_Er+0x100/0x170
> [ 4.614652] dotest+0x30/0x7bc
> [ 4.615007] locking_selftest+0x2c41/0x2ead
> [ 4.615468] start_kernel+0x5aa/0x6d5
> [ 4.615879] secondary_startup_64_no_verify+0xe0/0xeb
> [ 4.616607]
[...]
> As I told you, DEPT treats a queued lock as a normal type lock, no
> matter whether it's a read lock. That's why it prints just
> 'lock(&rwlock_A:0)' instead of 'read_lock(&rwlock_A:0)'. If needed, I'm
> gonna change the format.
>
> I checked the selftest code and found, LOCK(B) is transformed like:
>
> LOCK(B) -> WL(B) -> write_lock(&rwlock_B)
>
> That's why '&rwlock_B' is printed instead of just 'B', JFYI.
>
Nah, you output shows that you've run at least both function
queued_read_lock_hardirq_RE_Er()
queued_read_lock_hardirq_ER_rE()
but if you apply my diff
https://lore.kernel.org/lkml/Y8oFj9A19cw3enHB@boqun-archlinux/
you should only run
queued_read_lock_hardirq_RE_Er()
one test.
One of the reason that DEPT "detect" this is that DEPT doesn't reset
between tests, so old dependencies from previous run get carried over.
> Plus, for your information, you should turn on CONFIG_DEPT to use it.
>
Yes I turn that config on.
Regards,
Boqun
> Byungchul
Powered by blists - more mailing lists