[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5aa08b4f-251e-a63d-c36c-324a04ba24f4@redhat.com>
Date: Tue, 30 Aug 2022 14:17:44 +0200
From: David Hildenbrand <david@...hat.com>
To: John Hubbard <jhubbard@...dia.com>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: Jens Axboe <axboe@...nel.dk>,
Alexander Viro <viro@...iv.linux.org.uk>,
Miklos Szeredi <miklos@...redi.hu>,
Christoph Hellwig <hch@...radead.org>,
"Darrick J . Wong" <djwong@...nel.org>,
Trond Myklebust <trond.myklebust@...merspace.com>,
Anna Schumaker <anna@...nel.org>, Jan Kara <jack@...e.cz>,
Logan Gunthorpe <logang@...tatee.com>,
linux-block@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-xfs@...r.kernel.org, linux-nfs@...r.kernel.org,
linux-mm@...ck.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/6] mm/gup: introduce pin_user_page()
On 29.08.22 21:33, John Hubbard wrote:
> On 8/29/22 05:07, David Hildenbrand wrote:
>>> +/**
>>> + * pin_user_page() - apply a FOLL_PIN reference to a page
>>> + *
>>> + * @page: the page to be pinned.
>>> + *
>>> + * This is similar to get_user_pages(), except that the page's refcount is
>>> + * elevated using FOLL_PIN, instead of FOLL_GET.
>
> Actually, my commit log has a more useful documentation of this routine,
> and given the questions below, I think I'll change to that:
>
> * pin_user_page() is an externally-usable version of try_grab_page(), but with
> * semantics that match get_page(), so that it can act as a drop-in replacement
> * for get_page().
> *
> * pin_user_page() elevates a page's refcount using FOLL_PIN rules. This means
> * that the caller must release the page via unpin_user_page().
Some thoughts:
a) Can we generalize such that pages with a dedicated pincount
(multi-page folios) are also covered? Maybe avoiding the refcount
terminology would be best.
b) Should we directly work on folios?
c) Would it be valid to pass in a tail page right now?
>
>>> + *
>>> + * IMPORTANT: The caller must release the page via unpin_user_page().
>>> + *
>>> + */
>>> +void pin_user_page(struct page *page)
>>> +{
>>> + struct folio *folio = page_folio(page);
>>> +
>>> + WARN_ON_ONCE(folio_ref_count(folio) <= 0);
>>> +
>>
>> We should warn if the page is anon and !exclusive.
>
> That would be sort of OK, because pin_user_page() is being created
> specifically for file system (O_DIRECT cases) use, and so the pages
> should mostly be file-backed, rather than anon. Although I'm a little
> vague about whether all of these iov_iter cases are really always
> file-backed pages, especially for cases such as splice(2) to an
> O_DIRECT-opened file, that Al Viro mentioned [1].
If we can, we should document that this interface is not for anonymous
pages and WARN if pinning an anonymous page via this interface.
The only reasonable way to obtain a pin on an anonymous page is via the
page table. Here, FOLL_PIN should be used to do the right thing -- for
example, unshare first (break COW) instead of pinning a shared anonymous
page.
Nothing would speak against duplicating such a pin using this interface
(we'd have to sanity check that the page we're pinning may already be
pinned), but I assume the pages we pin here are *not* necessarily
obtained via GUP FOLL_PIN.
I would be curious under which scenarios we could end up here with an
anonymous page and how we obtained that reference (if not via GUP).
>
> Can you walk me through the reasoning for why we need to keep out
> anon shared pages?
We make sure to only pin anonymous pages that are exclusive and check
when unpinning -- see sanity_check_pinned_pages(), there is also a
comment in there -- that pinned anonymous pages are in fact still
exclusive, otherwise we might have a BUG lurking somewhere that can
result in memory corruptions or leaking information between processes.
For example, once we'd pinned an anonymous pages that are not marked
exclusive (!PageAnonExclusive), or we'd be sharing a page that is
pinned, the next write fault would replace the page in the user page
table due to breaking COW, and the GUP pin would point at a different
page than the page table.
Disallowing pinning of anon pages that may be shared in any case
(FOLL_LONGTERM or not) simplifies GUP handling and allows for such
sanity checks.
(side note: after recent VM_BUG_ON discussions we might want to convert
the VM_BUG_ON_PAGE in sanity_check_pinned_pages())
>
>>
>> I assume the intend is to use pin_user_page() only to duplicate pins, right?
>>
>
> Well, yes or no, depending on your use of the term "pin":
>
> pin_user_page() is used on a page that already has a refcount >= 1 (so
> no worries about speculative pinning should apply here), but the page
> does not necessarily have any FOLL_PIN's applied to it yet (so it's not
> "pinned" in the FOLL_PIN sense).
Okay, then we should really figure out if/how anonymous pages could end
up here. I assume they can't, really. But let's see :)
--
Thanks,
David / dhildenb
Powered by blists - more mailing lists