[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200625112455.GC17788@quack2.suse.cz>
Date: Thu, 25 Jun 2020 13:24:55 +0200
From: Jan Kara <jack@...e.cz>
To: John Hubbard <jhubbard@...dia.com>
Cc: Jason Gunthorpe <jgg@...pe.ca>,
Chris Wilson <chris@...is-wilson.co.uk>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, intel-gfx@...ts.freedesktop.org,
Andrew Morton <akpm@...ux-foundation.org>,
Jan Kara <jack@...e.cz>,
Jérôme Glisse <jglisse@...hat.com>,
Claudio Imbrenda <imbrenda@...ux.ibm.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: [PATCH] mm: Skip opportunistic reclaim for dma pinned pages
On Wed 24-06-20 17:11:30, John Hubbard wrote:
> On 2020-06-24 16:20, Jason Gunthorpe wrote:
> > > I do like this code change, though. And I *think* it's actually safe to
> > > do this, as it stays away from writeback or other filesystem activity.
> > > But let me double check that, in case I'm forgetting something.
>
> ...OK, I've checked, and I like it a little bit less now. Mainly for
> structural reasons, though. I think it would work correctly. But
> here's a concern: try_to_unmap() should only fail to unmap if there is a
> reason to not unmap. Having a page be pinned for dma is a reason to not
> *free* a page, and it's also a reason to be careful about writeback and
> page buffers for writeback and such. But I'm not sure that it's a reason
> to fail to remove mappings.
>
> True, most (all?) of the reasons that we remove mappings, generally are
> for things that are not allowed while a page is dma-pinned...at least,
> today. But still, there's nothing fundamental about a mapping that
> should prevent it from coming or going while a page is undergoing
> dma.
>
> So, it's merely a convenient, now-misnamed location in the call stack
> to fail out. That's not great. It might be better, as Jason hints at
> below, to fail out a little earlier, instead. That would lead to a more
> places to call page_maybe_dma_pinned(), but that's not a real problem,
> because it's still a small number of places.
Agreed, I think that shrink_page_list() as Michal writes is a better place
for the page_may_be_dma_pinned() check. But other than that I agree it is
good to avoid all the unnecessary work of preparing a page for reclaim when
we can now check there's no hope of reclaiming it (at this time).
> > I don't know, but could it be that try_to_unmap() has to be done
> > before checking the refcount as each mapping is included in the
> > refcount? ie we couldn't know a DMA pin was active in advance?
> >
> > Now that we have your pin stuff we can detect a DMA pin without doing
> > all the unmaps?
> >
>
> Once something calls pin_user_page*(), then the pages will be marked
> as dma-pinned, yes. So no, there is no need to wait until try_to_unmap()
> to find out.
Yeah, I'm pretty sure the page ref check happens so late because earlier it
was impossible to tell whether there are "external" page references reclaim
cannot get rid of - filesystem may (but need not!) hold a page reference
for its private data, page tables will hold references as well, etc. Now
with page_may_be_dma_pinned() we can detect at least one class of external
references (i.e. pins) early so it's stupid not to do that.
> A final note: depending on where page_maybe_dma_pinned() ends up
> getting called, this might prevent a fair number of the problems that
> Jan originally reported [1], and that I also reported separately!
>
> Well, not all of the problems, and only after the filesystems get
> converted to call pin_user_pages() (working on that next), but...I think
> it would actually avoid the crash our customer reported back in early
> 2018. Even though we don't have the full file lease + pin_user_pages()
> solution in place.
>
> That's because reclaim is what triggers the problems that we saw. And
> with this patch, we bail out of reclaim early.
I agree that with this change, some races will become much less likely for
some usecases. But as you say, it's not a full solution.
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists