linux-ext4 - Re: [PATCH] ext4: Fix deadlock due to mbcache entry corruption

Message-ID: <Y5IFR4K9hO8ax1Y0@mit.edu>
Date:   Thu, 8 Dec 2022 10:39:51 -0500
From:   "Theodore Ts'o" <tytso@....edu>
To:     Jan Kara <jack@...e.cz>
Cc:     Thorsten Leemhuis <regressions@...mhuis.info>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        linux-ext4@...r.kernel.org, stable@...r.kernel.org,
        Thilo Fromm <t-lo@...ux.microsoft.com>,
        Jeremi Piotrowski <jpiotrowski@...ux.microsoft.com>,
        Andreas Gruenbacher <agruenba@...hat.com>
Subject: Re: [PATCH] ext4: Fix deadlock due to mbcache entry corruption

On Thu, Dec 08, 2022 at 10:15:23AM +0100, Jan Kara wrote:
> > Furthermore, the fix which Jan provided, and which apparently fixes
> > the user's problem, (a) doesn't touch the ext4_bmap function, and (b)
> > has a fixes tag for the patch:
> > 
> >     Fixes: 6048c64b2609 ("mbcache: add reusable flag to cache entries")
> > 
> > ... which is a commit which dates back to 2016, and the v4.6 kernel.  ?!?
> 
> Yes. AFAICT the bitfield race in mbcache was introduced in this commit but
> somehow ext4 was using mbcache in a way that wasn't tripping the race.
> After 65f8b80053 ("ext4: fix race when reusing xattr blocks"), the race
> became much more likely and users started to notice...

Ah, OK.  And 65f8b80053 landed in 6.0, so while the bug may have been
around for much longer, this change made it much more likely that
folks would notice.  That's the missing piece and why Microsoft
started noticing this in their "Flatcar" container kernel.

So I'll update the commit description to make this clearer, and then
figure out how to tell the regression-bot that the regression should
be tracked against commit 65f8b80053 instead of 51ae846cff5 ("ext4:
fix warning in ext4_iomap_begin as race between bmap and write").

Cheers, and thanks for the clarification,

					- Ted