lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 21 Jan 2023 12:28:14 +0900
From:   Byungchul Park <byungchul.park@....com>
To:     boqun.feng@...il.com
Cc:     linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org,
        damien.lemoal@...nsource.wdc.com, linux-ide@...r.kernel.org,
        adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org,
        mingo@...hat.com, peterz@...radead.org, will@...nel.org,
        tglx@...utronix.de, rostedt@...dmis.org, joel@...lfernandes.org,
        sashal@...nel.org, daniel.vetter@...ll.ch, duyuyang@...il.com,
        johannes.berg@...el.com, tj@...nel.org, tytso@....edu,
        willy@...radead.org, david@...morbit.com, amir73il@...il.com,
        gregkh@...uxfoundation.org, kernel-team@....com,
        linux-mm@...ck.org, akpm@...ux-foundation.org, mhocko@...nel.org,
        minchan@...nel.org, hannes@...xchg.org, vdavydov.dev@...il.com,
        sj@...nel.org, jglisse@...hat.com, dennis@...nel.org, cl@...ux.com,
        penberg@...nel.org, rientjes@...gle.com, vbabka@...e.cz,
        ngupta@...are.org, linux-block@...r.kernel.org,
        paolo.valente@...aro.org, josef@...icpanda.com,
        linux-fsdevel@...r.kernel.org, viro@...iv.linux.org.uk,
        jack@...e.cz, jlayton@...nel.org, dan.j.williams@...el.com,
        hch@...radead.org, djwong@...nel.org,
        dri-devel@...ts.freedesktop.org, rodrigosiqueiramelo@...il.com,
        melissa.srw@...il.com, hamohammed.sa@...il.com,
        42.hyeyoo@...il.com, chris.p.wilson@...el.com,
        gwan-gyeong.mun@...el.com, max.byungchul.park@...il.com,
        longman@...hat.com
Subject: Re: [PATCH RFC v7 00/23] DEPT(Dependency Tracker)

On Thu, Jan 19, 2023 at 07:07:59PM -0800, Boqun Feng wrote:
> On Thu, Jan 19, 2023 at 06:23:49PM -0800, Boqun Feng wrote:
> > On Fri, Jan 20, 2023 at 10:51:45AM +0900, Byungchul Park wrote:

[...]

> > > T0		T1		T2
> > > --		--		--
> > > unfair_read_lock(A);
> > >			write_lock(B);
> > >					write_lock(A);
> > > write_lock(B);
> > >			fair_read_lock(A);
> > > write_unlock(B);
> > > read_unlock(A);
> > >			read_unlock(A);
> > >			write_unlock(B);
> > >					write_unlock(A);
> > > 
> > > T0: read_unlock(A) cannot happen if write_lock(B) is stuck by a B owner
> > >     not doing either write_unlock(B) or read_unlock(B). In other words:
> > > 
> > >       1. read_unlock(A) happening depends on write_unlock(B) happening.
> > >       2. read_unlock(A) happening depends on read_unlock(B) happening.
> > > 
> > > T1: write_unlock(B) cannot happen if fair_read_lock(A) is stuck by a A
> > >     owner not doing either write_unlock(A) or read_unlock(A). In other
> > >     words:
> > > 
> > >       3. write_unlock(B) happening depends on write_unlock(A) happening.
> > >       4. write_unlock(B) happening depends on read_unlock(A) happening.
> > > 
> > > 1, 2, 3 and 4 give the following dependencies:
> > > 
> > >     1. read_unlock(A) -> write_unlock(B)
> > >     2. read_unlock(A) -> read_unlock(B)
> > >     3. write_unlock(B) -> write_unlock(A)
> > >     4. write_unlock(B) -> read_unlock(A)
> > > 
> > > With 1 and 4, there's a circular dependency so DEPT definitely report
> > > this as a problem.
> > > 
> > > REMIND: DEPT focuses on waits and events.
> > 
> > Do you have the test cases showing DEPT can detect this?
> > 
> 
> Just tried the following on your latest GitHub branch, I commented all
> but one deadlock case. Lockdep CAN detect it but DEPT CANNOT detect it.
> Feel free to double check.

I tried the 'queued read lock' test cases with DEPT on. I can see DEPT
detect and report it. But yeah.. it's too verbose now. It's because DEPT
is not aware of the test environment so it's just working hard to report
every case.

To make DEPT work with the selftest better, some works are needed. I
will work on it later or you please work on it.

The corresponding report is the following.

---

[    4.583997] ===================================================
[    4.585094] DEPT: Circular dependency has been detected.
[    4.585620] 6.0.0-00023-g331e0412f735 #2 Tainted: G        W         
[    4.586347] ---------------------------------------------------
[    4.586942] summary
[    4.587161] ---------------------------------------------------
[    4.587757] *** DEADLOCK ***
[    4.587757] 
[    4.588198] context A
[    4.588434]     [S] lock(&rwlock_A:0)
[    4.588804]     [W] lock(&rwlock_B:0)
[    4.589175]     [E] unlock(&rwlock_A:0)
[    4.589565] 
[    4.589727] context B
[    4.589963]     [S] lock(&rwlock_B:0)
[    4.590375]     [W] lock(&rwlock_A:0)
[    4.590749]     [E] unlock(&rwlock_B:0)
[    4.591136] 
[    4.591295] [S]: start of the event context
[    4.591716] [W]: the wait blocked
[    4.592049] [E]: the event not reachable
[    4.592443] ---------------------------------------------------
[    4.593037] context A's detail
[    4.593351] ---------------------------------------------------
[    4.593944] context A
[    4.594182]     [S] lock(&rwlock_A:0)
[    4.594577]     [W] lock(&rwlock_B:0)
[    4.594952]     [E] unlock(&rwlock_A:0)
[    4.595341] 
[    4.595501] [S] lock(&rwlock_A:0):
[    4.595848] [<ffffffff814eb244>] queued_read_lock_hardirq_ER_rE+0xf4/0x170
[    4.596547] stacktrace:
[    4.596797]       _raw_read_lock+0xcf/0x110
[    4.597215]       queued_read_lock_hardirq_ER_rE+0xf4/0x170
[    4.597766]       dotest+0x30/0x7bc
[    4.598118]       locking_selftest+0x2c6f/0x2ead
[    4.598602]       start_kernel+0x5aa/0x6d5
[    4.599017]       secondary_startup_64_no_verify+0xe0/0xeb
[    4.599562] 
[    4.599721] [W] lock(&rwlock_B:0):
[    4.600064] [<ffffffff814eb250>] queued_read_lock_hardirq_ER_rE+0x100/0x170
[    4.600823] stacktrace:
[    4.601075]       dept_wait+0x12c/0x1d0
[    4.601465]       _raw_write_lock+0xa0/0xd0
[    4.601892]       queued_read_lock_hardirq_ER_rE+0x100/0x170
[    4.602496]       dotest+0x30/0x7bc
[    4.602854]       locking_selftest+0x2c6f/0x2ead
[    4.603333]       start_kernel+0x5aa/0x6d5
[    4.603745]       secondary_startup_64_no_verify+0xe0/0xeb
[    4.604298] 
[    4.604458] [E] unlock(&rwlock_A:0):
[    4.604820] (N/A)
[    4.605023] ---------------------------------------------------
[    4.605617] context B's detail
[    4.605930] ---------------------------------------------------
[    4.606551] context B
[    4.606790]     [S] lock(&rwlock_B:0)
[    4.607163]     [W] lock(&rwlock_A:0)
[    4.607534]     [E] unlock(&rwlock_B:0)
[    4.607920] 
[    4.608080] [S] lock(&rwlock_B:0):
[    4.608427] [<ffffffff814eb3b4>] queued_read_lock_hardirq_RE_Er+0xf4/0x170
[    4.609113] stacktrace:
[    4.609366]       _raw_write_lock+0xc3/0xd0
[    4.609788]       queued_read_lock_hardirq_RE_Er+0xf4/0x170
[    4.610371]       dotest+0x30/0x7bc
[    4.610730]       locking_selftest+0x2c41/0x2ead
[    4.611195]       start_kernel+0x5aa/0x6d5
[    4.611615]       secondary_startup_64_no_verify+0xe0/0xeb
[    4.612164] 
[    4.612325] [W] lock(&rwlock_A:0):
[    4.612671] [<ffffffff814eb3c0>] queued_read_lock_hardirq_RE_Er+0x100/0x170
[    4.613369] stacktrace:
[    4.613622]       _raw_read_lock+0xac/0x110
[    4.614047]       queued_read_lock_hardirq_RE_Er+0x100/0x170
[    4.614652]       dotest+0x30/0x7bc
[    4.615007]       locking_selftest+0x2c41/0x2ead
[    4.615468]       start_kernel+0x5aa/0x6d5
[    4.615879]       secondary_startup_64_no_verify+0xe0/0xeb
[    4.616607] 
[    4.616769] [E] unlock(&rwlock_B:0):
[    4.617132] (N/A)
[    4.617336] ---------------------------------------------------
[    4.617927] information that might be helpful
[    4.618390] ---------------------------------------------------
[    4.618981] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W          6.0.0-00023-g331e0412f735 #2
[    4.619886] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[    4.620699] Call Trace:
[    4.620958]  <TASK>
[    4.621182]  dump_stack_lvl+0x5d/0x81
[    4.621561]  print_circle.cold+0x52b/0x545
[    4.621983]  ? print_circle+0xd0/0xd0
[    4.622385]  cb_check_dl+0x58/0x60
[    4.622737]  bfs+0xba/0x170
[    4.623029]  add_dep+0x85/0x170
[    4.623355]  ? from_pool+0x4c/0x160
[    4.623714]  __dept_wait+0x1fd/0x600
[    4.624081]  ? queued_read_lock_hardirq_ER_rE+0x100/0x170
[    4.624628]  ? rcu_read_lock_held_common+0x9/0x50
[    4.625108]  ? queued_read_lock_hardirq_ER_rE+0x100/0x170
[    4.625652]  dept_wait+0x12c/0x1d0
[    4.626000]  _raw_write_lock+0xa0/0xd0
[    4.626417]  queued_read_lock_hardirq_ER_rE+0x100/0x170
[    4.626951]  dotest+0x30/0x7bc
[    4.627270]  locking_selftest+0x2c6f/0x2ead
[    4.627702]  start_kernel+0x5aa/0x6d5
[    4.628081]  secondary_startup_64_no_verify+0xe0/0xeb
[    4.628597]  </TASK>
---

The most important part is the following.

[    4.588198] context A
[    4.588434]     [S] lock(&rwlock_A:0)
[    4.588804]     [W] lock(&rwlock_B:0)
[    4.589175]     [E] unlock(&rwlock_A:0)
[    4.589565] 
[    4.589727] context B
[    4.589963]     [S] lock(&rwlock_B:0)
[    4.590375]     [W] lock(&rwlock_A:0)
[    4.590749]     [E] unlock(&rwlock_B:0)

As I told you, DEPT treats a queued lock as a normal type lock, no
matter whether it's a read lock. That's why it prints just
'lock(&rwlock_A:0)' instead of 'read_lock(&rwlock_A:0)'. If needed, I'm
gonna change the format.

I checked the selftest code and found, LOCK(B) is transformed like:

	LOCK(B) -> WL(B) -> write_lock(&rwlock_B)

That's why '&rwlock_B' is printed instead of just 'B', JFYI.

Plus, for your information, you should turn on CONFIG_DEPT to use it.

	Byungchul

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ