Message-ID: <CAHk-=wiAQ23ongRsJTdYhpQRn2YP-2-Z4_NkWiSJRyv6wf_dxg@mail.gmail.com>
Date: Tue, 24 Sep 2024 12:24:14 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Chris Mason <clm@...a.com>
Cc: Matthew Wilcox <willy@...radead.org>, Jens Axboe <axboe@...nel.dk>,
Dave Chinner <david@...morbit.com>, Christian Theune <ct@...ingcircus.io>, linux-mm@...ck.org,
"linux-xfs@...r.kernel.org" <linux-xfs@...r.kernel.org>, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, Daniel Dao <dqminh@...udflare.com>,
regressions@...ts.linux.dev, regressions@...mhuis.info
Subject: Re: Known and unfixed active data loss bug in MM + XFS with large
folios since Dec 2021 (any kernel from 6.1 upwards)
On Tue, 24 Sept 2024 at 12:18, Chris Mason <clm@...a.com> wrote:
>
> A few days of load later and some extra printks, it turns out that
> taking the writer lock in __filemap_add_folio() makes us dramatically
> more likely to just return EEXIST than go into the xas_split_alloc() dance.
.. and that sounds like a good thing, except for the test coverage, I guess.

Which you seem to have fixed:
> With the changes in 6.10, we only get into that xas_destroy() case above
> when the conflicting entry is a shadow entry, so I changed my repro to
> use memory pressure instead of fadvise.
>
> I also added a schedule_timeout(1) after the split alloc, and with all
> of that I'm able to consistently make the xas_destroy() case trigger
> without causing any system instability. Kairui Song's patches do seem
> to have fixed things nicely.
<confused thumbs up / fingers crossed emoji>

Linus