linux-kernel - Re: Report 2 in ext4 and journal based on v5.17-rc1

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220223144859.na2gjgl5efgw5zhn@quack3.lan>
Date:   Wed, 23 Feb 2022 15:48:59 +0100
From:   Jan Kara <jack@...e.cz>
To:     Byungchul Park <byungchul.park@....com>
Cc:     Jan Kara <jack@...e.cz>, torvalds@...ux-foundation.org,
        damien.lemoal@...nsource.wdc.com, linux-ide@...r.kernel.org,
        adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org,
        mingo@...hat.com, linux-kernel@...r.kernel.org,
        peterz@...radead.org, will@...nel.org, tglx@...utronix.de,
        rostedt@...dmis.org, joel@...lfernandes.org, sashal@...nel.org,
        daniel.vetter@...ll.ch, chris@...is-wilson.co.uk,
        duyuyang@...il.com, johannes.berg@...el.com, tj@...nel.org,
        tytso@....edu, willy@...radead.org, david@...morbit.com,
        amir73il@...il.com, bfields@...ldses.org,
        gregkh@...uxfoundation.org, kernel-team@....com,
        linux-mm@...ck.org, akpm@...ux-foundation.org, mhocko@...nel.org,
        minchan@...nel.org, hannes@...xchg.org, vdavydov.dev@...il.com,
        sj@...nel.org, jglisse@...hat.com, dennis@...nel.org, cl@...ux.com,
        penberg@...nel.org, rientjes@...gle.com, vbabka@...e.cz,
        ngupta@...are.org, linux-block@...r.kernel.org, axboe@...nel.dk,
        paolo.valente@...aro.org, josef@...icpanda.com,
        linux-fsdevel@...r.kernel.org, viro@...iv.linux.org.uk,
        jack@...e.com, jlayton@...nel.org, dan.j.williams@...el.com,
        hch@...radead.org, djwong@...nel.org,
        dri-devel@...ts.freedesktop.org, airlied@...ux.ie,
        rodrigosiqueiramelo@...il.com, melissa.srw@...il.com,
        hamohammed.sa@...il.com
Subject: Re: Report 2 in ext4 and journal based on v5.17-rc1

On Wed 23-02-22 09:35:34, Byungchul Park wrote:
> On Mon, Feb 21, 2022 at 08:02:04PM +0100, Jan Kara wrote:
> > On Thu 17-02-22 20:10:04, Byungchul Park wrote:
> > > [    9.008161] ===================================================
> > > [    9.008163] DEPT: Circular dependency has been detected.
> > > [    9.008164] 5.17.0-rc1-00015-gb94f67143867-dirty #2 Tainted: G        W
> > > [    9.008166] ---------------------------------------------------
> > > [    9.008167] summary
> > > [    9.008167] ---------------------------------------------------
> > > [    9.008168] *** DEADLOCK ***
> > > [    9.008168]
> > > [    9.008168] context A
> > > [    9.008169]     [S] (unknown)(&(&journal->j_wait_transaction_locked)->dmap:0)
> > > [    9.008171]     [W] wait(&(&journal->j_wait_commit)->dmap:0)
> > > [    9.008172]     [E] event(&(&journal->j_wait_transaction_locked)->dmap:0)
> > > [    9.008173]
> > > [    9.008173] context B
> > > [    9.008174]     [S] down_write(mapping.invalidate_lock:0)
> > > [    9.008175]     [W] wait(&(&journal->j_wait_transaction_locked)->dmap:0)
> > > [    9.008176]     [E] up_write(mapping.invalidate_lock:0)
> > > [    9.008177]
> > > [    9.008178] context C
> > > [    9.008179]     [S] (unknown)(&(&journal->j_wait_commit)->dmap:0)
> > > [    9.008180]     [W] down_write(mapping.invalidate_lock:0)
> > > [    9.008181]     [E] event(&(&journal->j_wait_commit)->dmap:0)
> > > [    9.008181]
> > > [    9.008182] [S]: start of the event context
> > > [    9.008183] [W]: the wait blocked
> > > [    9.008183] [E]: the event not reachable
> > 
> > So what situation is your tool complaining about here? Can you perhaps show
> > it here in more common visualization like:
> 
> Sure.
> 
> > TASK1				TASK2
> > 				does foo, grabs Z
> > does X, grabs lock Y
> > blocks on Z
> > 				blocks on Y
> > 
> > or something like that? Because I was not able to decipher this from the
> > report even after trying for some time...
> 
> KJOURNALD2(kthread)	TASK1(ksys_write)	TASK2(ksys_write)
> 
> wait A
> --- stuck
> 			wait B
> 			--- stuck
> 						wait C
> 						--- stuck
> 
> wake up B		wake up C		wake up A
> 
> where:
> A is a wait_queue, j_wait_commit
> B is a wait_queue, j_wait_transaction_locked
> C is a rwsem, mapping.invalidate_lock

I see. But a situation like this is not necessarily a guarantee of a
deadlock, is it? I mean there can be task D that will eventually call say
'wake up B' and unblock everything and this is how things were designed to
work? Multiple sources of wakeups are quite common I'd say... What does
Dept do to prevent false reports in cases like this?

> The above is the simplest form. And it's worth noting that Dept focuses
> on wait and event itself rather than grabing and releasing things like
> lock. The following is the more descriptive form of it.
> 
> KJOURNALD2(kthread)	TASK1(ksys_write)	TASK2(ksys_write)
> 
> wait @j_wait_commit
> 			ext4_truncate_failed_write()
> 			   down_write(mapping.invalidate_lock)
> 
> 			   ext4_truncate()
> 			      ...
> 			      wait @j_wait_transaction_locked
> 
> 						ext_truncate_failed_write()
> 						   down_write(mapping.invalidate_lock)
> 
> 						ext4_should_retry_alloc()
> 						   ...
> 						   __jbd2_log_start_commit()
> 						      wake_up(j_wait_commit)
> jbd2_journal_commit_transaction()
>    wake_up(j_wait_transaction_locked)
> 			   up_write(mapping.invalidate_lock)
> 
> I hope this would help you understand the report.

I see, thanks for explanation! So the above scenario is impossible because
for anyone to block on @j_wait_transaction_locked the transaction must be
committing, which is done only by kjournald2 kthread and so that thread
cannot be waiting at @j_wait_commit. Essentially blocking on
@j_wait_transaction_locked means @j_wait_commit wakeup was already done.

I guess this shows there can be non-trivial dependencies between wait
queues which are difficult to track in an automated way and without such
tracking we are going to see false positives...

								Honza

-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR