[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=whBOXM3mh-QtzK-EQtDEHQLcziAXu07KxU1crUc5jiQUg@mail.gmail.com>
Date: Thu, 4 Nov 2021 16:37:25 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: David Sterba <dsterba@...e.cz>, Qu Wenruo <wqu@...e.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>
Subject: Re: Kmap-related crashes and memory leaks on 32bit arch (5.15+)
On Thu, Nov 4, 2021 at 3:09 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> I'm obviously not on my laptop right now, but I did look at the btrfs
> lzo code earlier today again, and didn't see anything that looked
> remotely suspicious.
Ok, back at the laptop, and once again looking at this.
I'm stumped.
So if I understand correctly,
5.15 + 2cf3f8133bda ("btrfs: fix lzo_decompress_bio() kmap leakage")
is fine.
Also, 5.15 with the folio merge, plus the fix for that (commit
e66435936756: "mm: fix mismerge of folio page flag manipulators") is
fine too. I
The tested "tip of the day" that was bad was dcd68326d29b ("Merge tag
'devicetree-for-5.16' of git://...").
Since you already tested the folio merge, there really isn't a whole
lot of mm changes left in there. Andrew hasn't sent his patch-bombs
yet.
Doing a
gitk 49f8275c7d9247cf..037c50bfbeb33b \
mm/ include/linux/highmem* fs/btrfs/
really doesn't show anything that looks particularly suspicious.
There's some sync_bdev() changes, and some polling changes, but they
look trivial.
The only half-way relevant thing that remains is my merge, which very
much had conflicts around kmap/kunmap because of the revert problems.
So I continue to think that I must have screwed up, but I still don't
see which kmap/kunmap would be wrong.
I'll just repeat my suggestion here since the original email didn't go
to the lists.
> (a) test your side before my merge with your alternate kmap fix
> commit (the one you had in the other branch to make it allegedly
> easier to resolve)?
>
> (b) if that works, re-do the merge using that kmap pattern?
Your kmap() pattern is slightly different from mine. I tried to avoid
an unnecessary "kmap again" in copy_compressed_data_to_page(), so my
kmap lifetime is slightly longer over that loop.
Having looked at it once more, it still looks "ObviouslyCorrect(tm)"
to me, but I suspect I'm just being blind to some obvious bug.
> If (a) works, but (b) still fails, then it must be some odd
> interaction issue with something else. Which sounds unlikely, since I
> don't think we really had anything that should affect kmap or anything
> in this area, but who knows...
And bisection ends up perhaps somewhat painful, but sounds like the
way to go if there's no other path forward.
Linus
Powered by blists - more mailing lists