Message-ID: <CAG48ez08GhsK00-J=1hbZrccB7uZ10EbN8i1Zj4pfp4V=LZEZA@mail.gmail.com>
Date: Mon, 7 Jun 2021 21:55:09 +0200
From: Jann Horn <jannh@...gle.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: Linux-MM <linux-mm@...ck.org>, Zi Yan <ziy@...dia.com>,
Peter Xu <peterx@...hat.com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
Andrew Morton <akpm@...ux-foundation.org>,
chinwen.chang@...iatek.com,
kernel list <linux-kernel@...r.kernel.org>,
syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
Vlastimil Babka <vbabka@...e.cz>,
Michel Lespinasse <walken@...gle.com>,
syzbot <syzbot+1f52b3a18d5633fa7f82@...kaller.appspotmail.com>
Subject: Re: split_huge_page_to_list() races with page_mapcount() on migration
entry in smaps code? [was: Re: [syzbot] kernel BUG in __page_mapcount]

On Mon, Jun 7, 2021 at 8:03 PM Matthew Wilcox <willy@...radead.org> wrote:
> On Mon, Jun 07, 2021 at 07:27:23PM +0200, Jann Horn wrote:
> > === Short summary ===
> > I believe the issue here is a race between /proc/*/smaps and
> > split_huge_page_to_list():
> >
> > The codepath for /proc/*/smaps walks the pagetables and (e.g. in
> > smaps_account()) calls page_mapcount() not just on pages from normal
> > PTEs but also on migration entries (since commit b1d4d9e0cbd0a
> > "proc/smaps: carefully handle migration entries", from Linux v3.5).
> > page_mapcount() expects compound pages to be stable.
> >
> > The split_huge_page_to_list() path first protects the compound page by
> > locking it and replacing all its PTEs with migration entries (since
> > the THP rewrite in v4.5, I think?), then does the actual splitting
> > using __split_huge_page().
> >
> > So there's a mismatch of expectations here:
> > The smaps code expects that migration entries point to stable compound
> > pages, while the THP code expects that it's okay to split a compound
> > page while it has migration entries.
>
> Will it be a colossal performance penalty if we always get the page
> refcount after looking it up? That will cause split_huge_page() to
> fail to split the page if it hits this race.

Hmm - but with that approach I'm not sure you could even easily take a
refcount on a page whose refcount may be frozen and which may be in
the middle of being shattered? get_page_unless_zero() is wrong because
you can't take references on tail pages, right? (Or can you?) And
try_get_page() is wrong because it bugs out if the refcount is zero -
and even if it didn't do that, you might end up holding a reference on
the head page while the page you're actually interested in is a tail
page?

I guess if it was really necessary, it'd be possible to do some kind
of retry thing that grabs a reference on the compound head, then
checks that the tail page is still associated with the compound head,
and if not, drops the compound head and tries again?
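
To make the retry idea concrete, here is a rough user-space sketch in C.
It is only a toy model, not kernel code: struct toy_page,
toy_get_unless_zero(), toy_put() and toy_try_get_head() are hypothetical
names, the refcount is a plain C11 atomic, and a frozen refcount is
modelled as zero. The bounded retry count is also an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; NOT the kernel's layout. */
struct toy_page {
	atomic_int refcount;             /* 0 models "frozen or free" */
	_Atomic(struct toy_page *) head; /* compound head; self if not a tail */
};

/* Models the get_page_unless_zero() idea: only succeed if refcount != 0. */
static bool toy_get_unless_zero(struct toy_page *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop one reference taken with toy_get_unless_zero(). */
static void toy_put(struct toy_page *p)
{
	int old = atomic_fetch_sub(&p->refcount, 1);

	assert(old > 0);
}

/*
 * Pin whatever head currently owns 'page'. Returns the pinned head, or
 * NULL if no stable head could be pinned within max_tries attempts.
 */
static struct toy_page *toy_try_get_head(struct toy_page *page, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		struct toy_page *head = atomic_load(&page->head);

		if (!toy_get_unless_zero(head))
			continue; /* frozen (split in progress) or free: retry */

		/* Recheck the association now that we hold a reference. */
		if (atomic_load(&page->head) == head)
			return head;

		toy_put(head); /* page was split away from this head: retry */
	}
	return NULL;
}
```

In this model a caller that gets NULL back would have to skip the page or
fall back to something else; whether that is acceptable for the smaps
walk is exactly the open question.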