Message-ID: <20221024104628.ozxjtdrotysq2haj@quack3>
Date:   Mon, 24 Oct 2022 12:46:28 +0200
From:   Jan Kara <jack@...e.cz>
To:     Thilo Fromm <t-lo@...ux.microsoft.com>
Cc:     Jan Kara <jack@...e.cz>, Ye Bin <yebin10@...wei.com>,
        jack@...e.com, tytso@....edu, linux-ext4@...r.kernel.org,
        regressions@...ts.linux.dev,
        Jeremi Piotrowski <jpiotrowski@...ux.microsoft.com>
Subject: Re: [syzbot] possible deadlock in jbd2_journal_lock_updates

On Fri 21-10-22 12:23:41, Thilo Fromm wrote:
> Hello Honza,
> 
> > > Just want to make sure this does not get lost - as mentioned earlier,
> > > reverting 51ae846cff5 leads to a kernel build that does not have this issue.
> > 
> > Yes, I'm aware of this and still cannot quite wrap my head how it could be
> > given the stacktraces I see :) They do not seem to come anywhere near that
> > code...
> 
> Just reaching out to let folks know that we see more reports on this issue
> coming in for kernels >=5.15.63, see
> https://github.com/flatcar/Flatcar/issues/847#issuecomment-1286523602.

Yeah, I have been pondering this for some time, but I still have no clue
who could be holding the buffer lock (which blocks the task holding the
transaction open) or how this could be related to the commit you have
identified. I have two things to try:

1) Can you please check whether the deadlock also reproduces with a 6.0
kernel? The xattr handling code in ext4 has some additional changes there,
commit 307af6c8793 ("mbcache: automatically delete entries from cache on
freeing") in particular.

2) I have created a debug patch (against the 5.15.x stable kernel). Can you
please reproduce the failure with it and post the output of
"echo w >/proc/sysrq-trigger" and also the output the debug patch will put
into the kernel log? The patch dumps information about the buffer lock
owner if we cannot get the lock for more than 32 seconds.
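
For item 2, the capture step could be sketched roughly as below. This is
only an illustration: the function name and the SYSRQ_TRIGGER, OUT, LOG_CMD
and SETTLE_SECS variables are made up here, and the whole kernel log is
saved because the exact format of the debug patch output is not described
in this mail.

```shell
#!/bin/sh
# Hypothetical helper: trigger the SysRq "w" (blocked tasks) dump and save
# the kernel log to a file. All names below are illustrative assumptions.
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"
OUT="${OUT:-hung-tasks.log}"
LOG_CMD="${LOG_CMD:-dmesg}"
SETTLE_SECS="${SETTLE_SECS:-1}"

capture_blocked_tasks() {
    # Ask the kernel to dump tasks in uninterruptible (blocked) state.
    echo w > "$SYSRQ_TRIGGER" || return 1
    # Give the kernel a moment to finish printing the dump.
    sleep "$SETTLE_SECS"
    # Save the full kernel log; it should contain both the SysRq dump and
    # whatever the debug patch has printed so far.
    $LOG_CMD > "$OUT"
}
```

Calling capture_blocked_tasks as root, once the hang is visible and at least
32 seconds have passed, should leave everything needed in hung-tasks.log. If
the kernel ring buffer is small, older messages may already have been
overwritten, so capturing early or logging to a serial console may be safer.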

Thanks for your help and patience.

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
