[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190926035844.GA89510@google.com>
Date: Wed, 25 Sep 2019 21:58:44 -0600
From: Yu Zhao <yuzhao@...gle.com>
To: "Kirill A. Shutemov" <kirill@...temov.name>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Michal Hocko <mhocko@...e.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
Vlastimil Babka <vbabka@...e.cz>,
Hugh Dickins <hughd@...gle.com>,
Jérôme Glisse <jglisse@...hat.com>,
Andrea Arcangeli <aarcange@...hat.com>,
"Aneesh Kumar K . V" <aneesh.kumar@...ux.ibm.com>,
David Rientjes <rientjes@...gle.com>,
Matthew Wilcox <willy@...radead.org>,
Lance Roy <ldr709@...il.com>,
Ralph Campbell <rcampbell@...dia.com>,
Jason Gunthorpe <jgg@...pe.ca>,
Dave Airlie <airlied@...hat.com>,
Thomas Hellstrom <thellstrom@...are.com>,
Souptick Joarder <jrdr.linux@...il.com>,
Mel Gorman <mgorman@...e.de>, Jan Kara <jack@...e.cz>,
Mike Kravetz <mike.kravetz@...cle.com>,
Huang Ying <ying.huang@...el.com>,
Aaron Lu <ziqian.lzq@...fin.com>,
Omar Sandoval <osandov@...com>,
Thomas Gleixner <tglx@...utronix.de>,
Vineeth Remanan Pillai <vpillai@...italocean.com>,
Daniel Jordan <daniel.m.jordan@...cle.com>,
Mike Rapoport <rppt@...ux.ibm.com>,
Joel Fernandes <joel@...lfernandes.org>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v2] mm: don't expose page to fast gup prematurely
On Wed, Sep 25, 2019 at 03:17:50PM +0300, Kirill A. Shutemov wrote:
> On Tue, Sep 24, 2019 at 04:05:50PM -0600, Yu Zhao wrote:
> > On Tue, Sep 24, 2019 at 02:23:16PM +0300, Kirill A. Shutemov wrote:
> > > On Sat, Sep 14, 2019 at 01:05:18AM -0600, Yu Zhao wrote:
> > > > We don't want to expose page to fast gup running on a remote CPU
> > > > before all local non-atomic ops on page flags are visible first.
> > > >
> > > > For anon page that isn't in swap cache, we need to make sure all
> > > > prior non-atomic ops, especially __SetPageSwapBacked() in
> > > > page_add_new_anon_rmap(), are order before set_pte_at() to prevent
> > > > the following race:
> > > >
> > > > CPU 1 CPU1
> > > > set_pte_at() get_user_pages_fast()
> > > > page_add_new_anon_rmap() gup_pte_range()
> > > > __SetPageSwapBacked() SetPageReferenced()
> > >
> > > Is there a particular codepath that has what you listed for CPU?
> > > After quick look, I only saw that we page_add_new_anon_rmap() called
> > > before set_pte_at().
> >
> > I think so. One in do_swap_page() and another in unuse_pte(). Both
> > are on KSM paths. Am I referencing a stale copy of the source?
>
> I *think* it is a bug. Setting a pte before adding the page to rmap may
> lead to rmap (like try_to_unmap() or something) to miss the VMA.
>
> Do I miss something?
We have the pages locked in those two places, so for try_to_unmap()
and the rest of page_vma_mapped_walk() users, they will block on
the page lock:
CPU 1 CPU 2
lock_page()
set_pte_at()
unlock_page()
lock_page()
try_to_unmap()
page_vma_mapped_walk()
pte_present() without holding ptl
unlock_page()
For others that don't use page_vma_mapped_walk(), they should either
lock pages or grab ptl before checking pte_present().
AFAIK, the fast gup is the only one doesn't fall into the either
category.
Powered by blists - more mailing lists