Date:   Thu, 8 Dec 2022 00:55:55 -0500
From:   "Theodore Ts'o" <tytso@....edu>
To:     Thorsten Leemhuis <regressions@...mhuis.info>
Cc:     Andreas Dilger <adilger.kernel@...ger.ca>, Jan Kara <jack@...e.cz>,
        linux-ext4@...r.kernel.org, stable@...r.kernel.org,
        Thilo Fromm <t-lo@...ux.microsoft.com>,
        Jeremi Piotrowski <jpiotrowski@...ux.microsoft.com>,
        Andreas Gruenbacher <agruenba@...hat.com>
Subject: Re: [PATCH] ext4: Fix deadlock due to mbcache entry corruption

On Mon, Dec 05, 2022 at 04:41:49PM +0100, Thorsten Leemhuis wrote:
> 
> Jan's patch to fix the regression is now 12 days out and afaics
> didn't make any progress (or did I miss something?). Is there a reason
> why or did it simply fall through the cracks? Just asking, because it
> would be good to finally get this resolved.
> 
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

This patch (a) showed up right before the Thanksgiving holiday, and (b)
it just missed the Q/A cutoff for the ext4 bugfix pull request which I
sent to Linus right before I went on my Thanksgiving break.

Since Thanksgiving, I've been busy with the realities of corporate
life --- end of year performance evaluations, preparing for 2023
roadmap reviews with management, etc.  So the next pull request I was
planning to send to Linus was for when the merge window opens, and I'm
currently processing patches and running Q/A to be ready for the
opening of that merge window.


One thing which is completely unclear to me is how this relates to the
claimed regression.  I understand that Jeremi and Thilo have asserted
that the hang goes away if a backport of commit 51ae846cff5 ("ext4: fix
warning in ext4_iomap_begin as race between bmap and write") is not in
their 5.15 product tree.

However, the stack traces point to a problem in the extended attribute
code, which has nothing to do with ext4_bmap(), and commit 51ae846cff5
only changes ext4's bmap function --- which these days gets used
for the FIBMAP ioctl and very little else.
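
For context, here is a minimal sketch of how userspace reaches that bmap
path through the FIBMAP ioctl. The helper name fibmap_block() is
hypothetical and not taken from any kernel or ext4 source. FIBMAP
normally requires CAP_SYS_RAWIO, so expect the call to fail for
unprivileged users.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>   /* FIBMAP */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the filesystem which physical block backs
 * logical block 'logical' of the file at 'path'.
 * Returns 0 and stores the result in *physical on success,
 * or -1 with errno set on failure.
 */
int fibmap_block(const char *path, int logical, int *physical)
{
	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* FIBMAP takes the logical block in and returns the physical
	 * block through the same int. */
	int blk = logical;
	int ret = ioctl(fd, FIBMAP, &blk);

	/* Preserve the ioctl's errno across close(). */
	int saved_errno = errno;
	close(fd);
	errno = saved_errno;

	if (ret < 0)
		return -1;

	*physical = blk;
	return 0;
}
```

Nothing in this call path involves extended attributes or mbcache, which
is consistent with the mismatch described above.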

Furthermore, the fix which Jan provided, and which apparently fixes
the user's problem, (a) doesn't touch the ext4_bmap function, and (b)
has a fixes tag for the patch:

    Fixes: 6048c64b2609 ("mbcache: add reusable flag to cache entries")

... which is a commit which dates back to 2016, and the v4.6 kernel.  ?!?

So at this point, I have no idea whether or not this is a regression,
but we'll get the fix to Linus soon.

Cheers,

	   	    	      	      	 - Ted
