Message-ID: <c1902449-ab9d-4e26-c532-5df0a73dc1f9@redhat.com>
Date: Thu, 13 Apr 2023 14:41:43 +0200
From: David Hildenbrand <david@...hat.com>
To: David Howells <dhowells@...hat.com>
Cc: "Teterevkov, Ivan" <Ivan.Teterevkov@....com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"jhubbard@...dia.com" <jhubbard@...dia.com>,
"jack@...e.cz" <jack@...e.cz>,
"rppt@...ux.ibm.com" <rppt@...ux.ibm.com>,
"jglisse@...hat.com" <jglisse@...hat.com>,
"ira.weiny@...el.com" <ira.weiny@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Christoph Hellwig <hch@....de>,
Matthew Wilcox <willy@...radead.org>
Subject: Re: find_get_page() VS pin_user_pages()
On 12.04.23 10:41, David Howells wrote:
> David Hildenbrand <david@...hat.com> wrote:
>
>> I suspect that find_get_page() is not the kind of interface you want to use
>> for the purpose you describe. find_get_page() is a wrapper around
>> pagecache_get_page() and seems more like a helper for implementing an fs
>> (looking at the users and the fact that it only considers pages that are in
>> the pagecache).
>
> Btw, at some point we're going to need public functions to get extra pins on
> pages. vmsplice() should be pinning the pages it pushes into a pipe - so all
> pages in a pipe should probably be pinned - and anyone who splices a page out
> of a pipe and retains it (skbuffs spring strongly to mind) should also get a
> pin on the page.
As discussed, vmsplice() is a bit special, because it has
longterm-pinning semantics: we'd want to migrate the page out of
ZONE_MOVABLE/MIGRATE_CMA/... because the page might remain pinned in the
pipe possibly forever, controlled by user space.
pin_user_pages(FOLL_LONGTERM) would do the right thing, but we might
have to be careful with extra pins.
I guess it depends on what we want to achieve. Let's discuss what would
happen when we want to pin some page (without going via pin_user_pages())
that's definitely not an anon page -- so let's assume a pagecache page:
(a) Short-term pinning when already pinned (extra pins): easy.
(b) Short-term pinning when not pinned yet: should be fairly easy
(pin_user_pages() doesn't do anything special for pagecache pages
either).
(c) Long-term pinning when already long-term pinned (extra long-term
pins): easy.
(d) Long-term pinning when already short-term pinned: problematic,
because we might have to migrate the page first, but it's already
pinned ... and if we obtained the page via pin_user_pages() from a
MAP_PRIVATE VMA, we'd have to do another
pin_user_pages(FOLL_LONGTERM) that would properly break COW and give
us an anon page ...
(e) Long-term pinning when not pinned yet: fairly easy, but we might
have to migrate the page first (like FOLL_LONGTERM would).
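To make the case analysis above a bit more concrete, here is a minimal
userspace sketch in C. It is only a model: the enum names, classify_pin()
and classify_pin_selfcheck() are made up for illustration and are not
kernel interfaces, and the verdicts merely restate (a)-(e) for a
pagecache page.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; none of these names exist in the kernel. */
enum pin_state { NOT_PINNED, SHORT_TERM_PINNED, LONG_TERM_PINNED };
enum verdict { EASY, FAIRLY_EASY, MAY_NEED_MIGRATION, PROBLEMATIC };

/*
 * Restate cases (a)-(e) for a pagecache page: given the current pin
 * state and the kind of pin requested, how hard is it to take the pin?
 */
static enum verdict classify_pin(enum pin_state cur, bool want_longterm)
{
	if (!want_longterm) {
		/* (b) first short-term pin, (a) extra short-term pin */
		return cur == NOT_PINNED ? FAIRLY_EASY : EASY;
	}

	if (cur == LONG_TERM_PINNED)
		return EASY;			/* (c) extra long-term pin */
	if (cur == SHORT_TERM_PINNED)
		return PROBLEMATIC;		/* (d) may need migration, but already pinned */
	return MAY_NEED_MIGRATION;		/* (e) like FOLL_LONGTERM would */
}

/* Usage example: one assertion per case from the list. */
void classify_pin_selfcheck(void)
{
	assert(classify_pin(SHORT_TERM_PINNED, false) == EASY);		/* (a) */
	assert(classify_pin(NOT_PINNED, false) == FAIRLY_EASY);		/* (b) */
	assert(classify_pin(LONG_TERM_PINNED, true) == EASY);			/* (c) */
	assert(classify_pin(SHORT_TERM_PINNED, true) == PROBLEMATIC);		/* (d) */
	assert(classify_pin(NOT_PINNED, true) == MAY_NEED_MIGRATION);		/* (e) */
}
```

The only combination that does not reduce to "take the pin, possibly
after migrating" is (d), which is why it is the one flagged as
problematic.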
Regarding anon pages, we should pin only via pin_user_pages(), so the
"not pinned" case does not apply. Replicating pins -- (a) and (c) -- is
usually easy, but (d) is similarly problematic.
Focusing again on !anon pages: if it's just "get another short-term pin
on an already pinned page", it's easy (and I recall John H. had
patches). If it's "get a long-term pin on an already pinned page", it
can be problematic.
Any pages that will never have to be migrated when long-term pinning
(just some allocated kernel page without MOVABLE semantics) are super
easy to pin, and to add extra pins to.
>
> So should all pages held by an skbuff be pinned rather than ref'd? I have a
> patch to use the bottom two bits of an skb frag's page pointer to keep track
> of whether the page it points to is ref'd, pinned or neither, but if we can
> make it pin/not-pin them, I only need one bit for that.
It might possibly be the right thing. But ref'd vs. pinned really only
makes a difference to (a) pages mapped into user space or (b) pages in
the pagecache. Of course, in any case, long-term semantics have to be
respected if the page to pin might have been allocated with MOVABLE
semantics.
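For reference, the "bottom two bits of the page pointer" idea from the
quoted text could look roughly like the userspace sketch below. This is
not the actual skb frag patch: enum frag_hold, frag_pack(), frag_page()
and frag_hold_of() are hypothetical names, and the sketch assumes the
pointer is at least 4-byte aligned so that the two low bits are free.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag values: how the frag holds on to its page. */
enum frag_hold { FRAG_NONE = 0, FRAG_REFD = 1, FRAG_PINNED = 2 };

/* The two low bits of an aligned pointer carry the tag. */
#define FRAG_HOLD_MASK ((uintptr_t)3)

/* Pack a page pointer and its hold type into a single word. */
static inline uintptr_t frag_pack(void *page, enum frag_hold hold)
{
	uintptr_t p = (uintptr_t)page;

	/* Assumption: the pointer is at least 4-byte aligned. */
	assert((p & FRAG_HOLD_MASK) == 0);
	return p | (uintptr_t)hold;
}

/* Recover the untagged page pointer. */
static inline void *frag_page(uintptr_t v)
{
	return (void *)(v & ~FRAG_HOLD_MASK);
}

/* Recover the hold type stored in the low bits. */
static inline enum frag_hold frag_hold_of(uintptr_t v)
{
	return (enum frag_hold)(v & FRAG_HOLD_MASK);
}
```

With only pinned/not-pinned to distinguish, the mask would shrink to a
single bit, which matches the "I only need one bit" remark.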
--
Thanks,
David / dhildenb