[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a473720f-ae8c-42f4-8f87-987a2d9151f9@gmail.com>
Date: Wed, 20 Dec 2023 00:49:10 +0000
From: Pavel Begunkov <asml.silence@...il.com>
To: Mina Almasry <almasrymina@...gle.com>, David Wei <dw@...idwei.uk>
Cc: io-uring@...r.kernel.org, netdev@...r.kernel.org,
Jens Axboe <axboe@...nel.dk>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jesper Dangaard Brouer
<hawk@...nel.org>, David Ahern <dsahern@...nel.org>
Subject: Re: [RFC PATCH v3 03/20] net: page pool: rework ppiov life cycle
On 12/19/23 23:35, Mina Almasry wrote:
> On Tue, Dec 19, 2023 at 1:04 PM David Wei <dw@...idwei.uk> wrote:
>>
>> From: Pavel Begunkov <asml.silence@...il.com>
>>
>> NOT FOR UPSTREAM
>> The final version will depend on how the ppiov infra looks like
>>
>> Page pool is tracking how many pages were allocated and returned, which
>> serves for refcounting the pool, and so every page/frag allocated should
>> eventually come back to the page pool via appropriate ways, e.g. by
>> calling page_pool_put_page().
>>
>> When it comes to normal page pools (i.e. without memory providers
>> attached), it's fine to return a page when it's still refcounted by
>> somewhat in the stack, in which case we'll "detach" the page from the
>> pool and rely on page refcount for it to return back to the kernel.
>>
>> Memory providers are different, at least ppiov based ones, they need
>> all their buffers to eventually return back, so apart from custom pp
>> ->release handlers, we'll catch when someone puts down a ppiov and call
>> its memory provider to handle it, i.e. __page_pool_iov_free().
>>
>> The first problem is that __page_pool_iov_free() hard coded devmem
>> handling, and other providers need a flexible way to specify their own
>> callbacks.
>>
>> The second problem is that it doesn't go through the generic page pool
>> paths and so can't do the mentioned pp accounting right. And we can't
>> even safely rely on page_pool_put_page() to be called somewhere before
>> to do the pp refcounting, because then the page pool might get destroyed
>> and ppiov->pp would point to garbage.
>>
>> The solution is to make the pp ->release callback to be responsible for
>> properly recycling its buffers, e.g. calling what was
>> __page_pool_iov_free() before in case of devmem.
>> page_pool_iov_put_many() will be returning buffers to the page pool.
>>
>
> Hmm this patch is working on top of slightly outdated code. I think> the correct solution here is to transition to using pp_ref_count for
> refcounting the ppiovs/niovs. Once we do that, we no longer need
> special refcounting for ppiovs, they're refcounted identically to
> pages, makes the pp more maintainable, gives us some unified handling
> of page pool refcounting, it becomes trivial to support fragmented
> pages which require a pp_ref_count, and all the code in this patch can
> go away.
>
> I'm unsure if this patch is just because you haven't rebased to my
> latest RFC (which is completely fine by me), or if you actually think
> using pp_ref_count for refcounting is wrong and want us to go back to
> the older model which required some custom handling for ppiov and
> disabled frag support. I'm guessing it's the former, but please
> correct if I'm wrong.
Right, it's based on older patches, it'd be a fool's work keep
rebasing it while the code is still changing unless there is a
good reason for that.
I haven't taken a look at devmem v5, I definitely going to. IMHO,
this approach is versatile and clear, but if there is a better one,
I'm all for it.
--
Pavel Begunkov
Powered by blists - more mailing lists