Message-ID: <20190118001608.GX4205@dastard>
Date: Fri, 18 Jan 2019 11:16:08 +1100
From: Dave Chinner <david@...morbit.com>
To: Jerome Glisse <jglisse@...hat.com>
Cc: John Hubbard <jhubbard@...dia.com>, Jan Kara <jack@...e.cz>,
Matthew Wilcox <willy@...radead.org>,
Dan Williams <dan.j.williams@...el.com>,
John Hubbard <john.hubbard@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux MM <linux-mm@...ck.org>, tom@...pey.com,
Al Viro <viro@...iv.linux.org.uk>, benve@...co.com,
Christoph Hellwig <hch@...radead.org>,
Christopher Lameter <cl@...ux.com>,
"Dalessandro, Dennis" <dennis.dalessandro@...el.com>,
Doug Ledford <dledford@...hat.com>,
Jason Gunthorpe <jgg@...pe.ca>,
Michal Hocko <mhocko@...nel.org>, mike.marciniszyn@...el.com,
rcampbell@...dia.com,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions

On Thu, Jan 17, 2019 at 10:21:08AM -0500, Jerome Glisse wrote:
> On Wed, Jan 16, 2019 at 09:42:25PM -0800, John Hubbard wrote:
> > On 1/16/19 5:08 AM, Jerome Glisse wrote:
> > > On Wed, Jan 16, 2019 at 12:38:19PM +0100, Jan Kara wrote:
> > >> That actually touches on another question I wanted to get opinions on. GUP
> > >> can be for read and GUP can be for write (that is one of GUP flags).
> > >> Filesystems with page cache generally have issues only with GUP for write
> > >> as it can currently corrupt data, unexpectedly dirty page etc.. DAX & memory
> > >> hotplug have issues with both (DAX cannot truncate page pinned in any way,
> > >> memory hotplug will just loop in kernel until the page gets unpinned). So
> > >> we probably want to track both types of GUP pins and page-cache based
> > >> filesystems will take the hit even if they don't have to for read-pins?
> > >
> > > Yes the distinction between read and write would be nice. With the map
> > > count solution you can only increment the mapcount for GUP(write=true).
> > > With pin bias the issue is that a big number of read pin can trigger
> > > false positive ie you would do:
> > > >     GUP(vaddr, write)
> > > >         ...
> > > >         if (write)
> > > >             atomic_add(page->refcount, PAGE_PIN_BIAS)
> > > >         else
> > > >             atomic_inc(page->refcount)
> > > >
> > > >     PUP(page, write)
> > > >         if (write)
> > > >             atomic_add(page->refcount, -PAGE_PIN_BIAS)
> > > >         else
> > > >             atomic_dec(page->refcount)
> > >
> > > I am guessing false positive because of too many read GUP is ok as
> > > it should be unlikely and when it happens then we take the hit.
> > >
> >
> > I'm also intrigued by the point that read-only GUP is harmless, and we
> > could just focus on the writeable case.
>
> For filesystem anybody that just look at the page is fine, as it would
> not change its content thus the page would stay stable.

Other processes can access and dirty the page cache page while there
is a GUP reference. It's unclear to me whether that changes what
GUP needs to do here, but we can't assume a page referenced for
read-only GUP will be clean and unchanging for the duration of the
GUP reference. It may even be dirty at the time of the read-only
GUP pin...
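
As a rough illustration of the pin-bias scheme quoted above and of the
point about read-only pins, here is a minimal userspace sketch using C11
atomics. It is not kernel code: struct toy_page, the toy_* helpers and
the PAGE_PIN_BIAS value of 1024 are all hypothetical names and numbers
chosen for the example, and the "dirty" flag stands in for whatever
other path (write(2), a store through another mapping) touches the page.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical value; the thread does not fix a number. */
#define PAGE_PIN_BIAS 1024

/* Toy stand-in for struct page: only the fields this sketch needs. */
struct toy_page {
	atomic_int refcount;
	atomic_bool dirty;
};

static void toy_page_init(struct toy_page *page)
{
	atomic_init(&page->refcount, 0);
	atomic_init(&page->dirty, false);
}

/* GUP side: a write pin adds the bias, a read pin adds one reference. */
static void toy_gup(struct toy_page *page, bool write)
{
	atomic_fetch_add(&page->refcount, write ? PAGE_PIN_BIAS : 1);
}

/* PUP side: must be passed the same 'write' value as the matching pin. */
static void toy_pup(struct toy_page *page, bool write)
{
	atomic_fetch_sub(&page->refcount, write ? PAGE_PIN_BIAS : 1);
}

/*
 * Heuristic check: true once the count reaches the bias. That also
 * happens with no write pin at all if enough read pins accumulate,
 * which is the false positive discussed in the quoted text.
 */
static bool toy_maybe_write_pinned(struct toy_page *page)
{
	return atomic_load(&page->refcount) >= PAGE_PIN_BIAS;
}

/* Some other path dirties the page; it neither knows nor cares about pins. */
static void toy_dirty_from_elsewhere(struct toy_page *page)
{
	atomic_store(&page->dirty, true);
}
```

In this model, taking a read pin with toy_gup(&p, false) and then
calling toy_dirty_from_elsewhere(&p) leaves the page dirty while
toy_maybe_write_pinned(&p) still reports false, so a "no write pin"
answer says nothing about whether the page is clean or stable.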

Cheers,

Dave.
--
Dave Chinner
david@...morbit.com