Message-ID: <CAPcyv4ifG5_jCyURNVNHE2cYKbVWYuzVydstaXWr6VPOZxoZ-A@mail.gmail.com>
Date: Wed, 16 Oct 2019 13:02:08 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Thomas Hellström (VMware)
<thomas_os@...pmail.org>
Cc: "Kirill A. Shutemov" <kirill@...temov.name>,
Matthew Wilcox <willy@...radead.org>,
linux-mm <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Thomas Hellstrom <thellstrom@...are.com>
Subject: Re: [RFC PATCH] mm: Fix a huge pud insertion race during faulting
On Tue, Oct 15, 2019 at 10:59 PM Thomas Hellström (VMware)
<thomas_os@...pmail.org> wrote:
>
> Hi, Dan,
>
> On 10/16/19 3:44 AM, Dan Williams wrote:
> > On Tue, Oct 15, 2019 at 3:06 AM Kirill A. Shutemov <kirill@...temov.name> wrote:
> >> On Tue, Oct 08, 2019 at 11:37:11AM +0200, Thomas Hellström (VMware) wrote:
> >>> From: Thomas Hellstrom <thellstrom@...are.com>
> >>>
> >>> A huge pud page can theoretically be faulted in racing with pmd_alloc()
> >>> in __handle_mm_fault(). That will lead to pmd_alloc() returning an
> >>> invalid pmd pointer. Fix this by adding a pud_trans_unstable() function
> >>> similar to pmd_trans_unstable() and check whether the pud is really stable
> >>> before using the pmd pointer.
> >>>
> >>> Race:
> >>> Thread 1:             Thread 2:             Comment
> >>> create_huge_pud()                           Fallback - not taken.
> >>>                       create_huge_pud()     Taken.
> >>> pmd_alloc()                                 Returns an invalid pointer.
> >>>
> >>> Cc: Matthew Wilcox <willy@...radead.org>
> >>> Fixes: a00cc7d9dd93 ("mm, x86: add support for PUD-sized transparent hugepages")
> >>> Signed-off-by: Thomas Hellstrom <thellstrom@...are.com>
> >>> ---
> >>> RFC: We include pud_devmap() as an unstable PUD flag. Is this correct?
> >>> Do the same for pmds?
> >> I *think* it is correct and we should do the same for PMD, but I may be
> >> wrong.
> >>
> >> Dan, Matthew, could you comment on this?
> > The _devmap() check in these paths near _trans_unstable() has always
> > been about avoiding the assumption that the corresponding page is
> > page cache or anonymous; a dax page is neither and does not behave
> > like a typical page.
>
> The concern here is that _trans_huge() returns false for _devmap()
> pages, which means that also _trans_unstable() returns false.
>
> Still, I figure someone could zap the entry at any time using madvise(),
> so AFAICT the entry is indeed unstable, and it's a bug not to include
> _devmap() in the _trans_unstable() functions?
Yes, I can't think of a case where it is wrong to include _devmap() in a
_trans_unstable() check. It may be unnecessary if the given path can't
reasonably ever encounter a file-backed dax mapping, but it's
otherwise ok to always consider _devmap().
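
To make the point concrete, here is a rough user-space sketch of what
"always consider _devmap()" could look like. It is only a toy model, not
the patch under discussion: the toy_pud_t type, the TOY_PUD_* flag bits,
and every toy_* helper are made up for illustration and do not match the
kernel's pud_t layout. The one behavior it borrows from this thread is
that the trans_huge test is false for devmap entries, so the unstable
check has to test devmap separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: hypothetical type and flag bits, not the kernel's pud_t. */
typedef struct { uint64_t val; } toy_pud_t;

#define TOY_PUD_PRESENT (1ULL << 0)  /* entry is populated */
#define TOY_PUD_HUGE    (1ULL << 1)  /* entry maps a huge page directly */
#define TOY_PUD_DEVMAP  (1ULL << 2)  /* huge mapping of device (dax) memory */

static bool toy_pud_none(toy_pud_t pud)
{
    return pud.val == 0;
}

static bool toy_pud_devmap(toy_pud_t pud)
{
    return (pud.val & TOY_PUD_DEVMAP) != 0;
}

/* As noted in the thread, this is false for devmap entries. */
static bool toy_pud_trans_huge(toy_pud_t pud)
{
    return (pud.val & (TOY_PUD_HUGE | TOY_PUD_DEVMAP)) == TOY_PUD_HUGE;
}

/*
 * Returns true when the entry must not be treated as a pointer to a
 * lower-level (pmd) table: it is empty, a huge page, or a devmap entry.
 */
static bool toy_pud_trans_unstable(const toy_pud_t *pudp)
{
    /* Take one snapshot; kernel code would use READ_ONCE() here. */
    toy_pud_t snap = *pudp;

    return toy_pud_none(snap) ||
           toy_pud_trans_huge(snap) ||
           toy_pud_devmap(snap);
}
```

In this model a caller would test toy_pud_trans_unstable() after the
huge-pud fallback and retry the fault when it returns true, rather than
walking into a pmd table that may not exist. Dropping the
toy_pud_devmap() term would make a dax-style entry look stable, which is
the case the question above is about.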