Message-ID: <Y5IFR4K9hO8ax1Y0@mit.edu>
Date: Thu, 8 Dec 2022 10:39:51 -0500
From: "Theodore Ts'o" <tytso@....edu>
To: Jan Kara <jack@...e.cz>
Cc: Thorsten Leemhuis <regressions@...mhuis.info>,
Andreas Dilger <adilger.kernel@...ger.ca>,
linux-ext4@...r.kernel.org, stable@...r.kernel.org,
Thilo Fromm <t-lo@...ux.microsoft.com>,
Jeremi Piotrowski <jpiotrowski@...ux.microsoft.com>,
Andreas Gruenbacher <agruenba@...hat.com>
Subject: Re: [PATCH] ext4: Fix deadlock due to mbcache entry corruption

On Thu, Dec 08, 2022 at 10:15:23AM +0100, Jan Kara wrote:
> > Furthermore, the fix which Jan provided, and which apparently fixes
> > the user's problem, (a) doesn't touch the ext4_bmap function, and (b)
> > has a fixes tag for the patch:
> >
> > Fixes: 6048c64b2609 ("mbcache: add reusable flag to cache entries")
> >
> > ... which is a commit which dates back to 2016, and the v4.6 kernel. ?!?
>
> Yes. AFAICT the bitfield race in mbcache was introduced in this commit but
> somehow ext4 was using mbcache in a way that wasn't tripping the race.
> After 65f8b80053 ("ext4: fix race when reusing xattr blocks"), the race
> became much more likely and users started to notice...

Ah, OK. And 65f8b80053 landed in 6.0, so while the bug may have been
around for much longer, this change made it much more likely that
folks would notice. That's the missing piece and why Microsoft
started noticing this in their "Flatcar" container kernel.

So I'll update the commit description so that this is more clear, and
then I can figure out how to tell the regression-bot that the
regression should be tracked using commit 65f8b80053 instead of
51ae846cff5 ("ext4: fix warning in ext4_iomap_begin as race between
bmap and write").

Cheers, and thanks for the clarification,

- Ted