[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201203165937.GU5487@ziepe.ca>
Date: Thu, 3 Dec 2020 12:59:37 -0400
From: Jason Gunthorpe <jgg@...pe.ca>
To: Pavel Tatashin <pasha.tatashin@...een.com>
Cc: LKML <linux-kernel@...r.kernel.org>, linux-mm <linux-mm@...ck.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
Michal Hocko <mhocko@...e.com>,
David Hildenbrand <david@...hat.com>,
Oscar Salvador <osalvador@...e.de>,
Dan Williams <dan.j.williams@...el.com>,
Sasha Levin <sashal@...nel.org>,
Tyler Hicks <tyhicks@...ux.microsoft.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>, mike.kravetz@...cle.com,
Steven Rostedt <rostedt@...dmis.org>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Mel Gorman <mgorman@...e.de>,
Matthew Wilcox <willy@...radead.org>,
David Rientjes <rientjes@...gle.com>,
John Hubbard <jhubbard@...dia.com>
Subject: Re: [PATCH 6/6] mm/gup: migrate pinned pages out of movable zone
On Thu, Dec 03, 2020 at 11:40:15AM -0500, Pavel Tatashin wrote:
> > Looking at this code some more.. How is it even correct?
> >
> > 1633 if (!isolate_lru_page(head)) {
> > 1634 list_add_tail(&head->lru, &cma_page_list);
> >
> > Here we are only running under the read side of the mmap sem so multiple
> > GUPs can be calling that sequence in parallel. I don't see any
> > obvious exclusion that will prevent corruption of head->lru. The first
> > GUP thread to do isolate_lru_page() will ClearPageLRU() and the second
> > GUP thread will be a NOP for isolate_lru_page().
> >
> > They will both race list_add_tail and other list ops. That is not OK.
>
> Good question. I studied it, and I do not see how this is OK. Worse,
> this race is also exposable as a syscall instead of via driver: two
> move_pages() run simultaneously. Perhaps in other places?
>
> move_pages()
> kernel_move_pages()
> mmget()
> do_pages_move()
> add_page_for_migratio()
> mmap_read_lock(mm);
> list_add_tail(&head->lru, pagelist); <- Not protected
When this was CMA only it might have been rarer to trigger, but this
move stuff sounds like it makes it much more broadly, eg on typical
servers with RDMA exposed/etc
Seems like it needs fixing as part of this too :\
Page at a time inside the gup loop could address both concerns, unsure
about batching performance here though..
Jason
Powered by blists - more mailing lists