Message-ID: <e2a77778-7a2b-2811-95ff-be67a44afceb@leemhuis.info>
Date: Thu, 8 Dec 2022 18:16:02 +0100
From: Thorsten Leemhuis <regressions@...mhuis.info>
To: Theodore Ts'o <tytso@....edu>, Jan Kara <jack@...e.cz>
Cc: Andreas Dilger <adilger.kernel@...ger.ca>,
linux-ext4@...r.kernel.org, stable@...r.kernel.org,
Thilo Fromm <t-lo@...ux.microsoft.com>,
Jeremi Piotrowski <jpiotrowski@...ux.microsoft.com>,
Andreas Gruenbacher <agruenba@...hat.com>
Subject: Re: [PATCH] ext4: Fix deadlock due to mbcache entry corruption

On 08.12.22 16:39, Theodore Ts'o wrote:
> On Thu, Dec 08, 2022 at 10:15:23AM +0100, Jan Kara wrote:
>>> Furthermore, the fix which Jan provided, and which apparently fixes
>>> the user's problem, (a) doesn't touch the ext4_bmap function, and (b)
>>> has a fixes tag for the patch:
>>>
>>> Fixes: 6048c64b2609 ("mbcache: add reusable flag to cache entries")
>>>
>>> ... which is a commit which dates back to 2016, and the v4.6 kernel. ?!?
>>
>> Yes. AFAICT the bitfield race in mbcache was introduced in this commit but
>> somehow ext4 was using mbcache in a way that wasn't tripping the race.
>> After 65f8b80053 ("ext4: fix race when reusing xattr blocks"), the race
>> became much more likely and users started to notice...
>
> Ah, OK. And 65f8b80053 landed in 6.0, so while the bug may have been
> around for much longer, this change made it much more likely that
> folks would notice. That's the missing piece and why Microsoft
> started noticing this in their "Flatcar" container kernel.
Yeah, likely when 65f8b80053 was backported to 5.15.y in 1be97463696c.
> So I'll update the commit description so that this is more clear,
Thx for taking care of this, I'm glad this is on track now.
Maybe I should talk to Greg again to revert backported changes like
1be97463696c until fixes for them are ready.
> and
> then I can figure out how to tell the regression-bot that the
> regression should be tracked using commit 65f8b80053 instead of
> 51ae846cff5 ("ext4: fix warning in ext4_iomap_begin as race between
> bmap and write").
FWIW, there is no strong need to; nobody looks at those details once the
regression is fixed. But yeah, that might change over time, so let me
take care of that:
#regzbot introduced: 65f8b80053
[normally things like that have to be done as a direct or indirect reply
to the report, but regzbot knows (famous last words...) how to associate
this command with the report, as the patch that started this thread
linked to the report using a Link: tag].
Ciao, Thorsten