Message-ID: <871pv2igd9.fsf@toke.dk>
Date: Wed, 12 Mar 2025 13:27:14 +0100
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Yunsheng Lin <linyunsheng@...wei.com>, Yunsheng Lin
<yunshenglin0825@...il.com>, Andrew Morton <akpm@...ux-foundation.org>,
Jesper Dangaard Brouer <hawk@...nel.org>, Ilias Apalodimas
<ilias.apalodimas@...aro.org>, "David S. Miller" <davem@...emloft.net>
Cc: Yonglong Liu <liuyonglong@...wei.com>, Mina Almasry
<almasrymina@...gle.com>, Eric Dumazet <edumazet@...gle.com>, Jakub
Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Simon Horman
<horms@...nel.org>, linux-mm@...ck.org, netdev@...r.kernel.org
Subject: Re: [RFC PATCH net-next] page_pool: Track DMA-mapped pages and
unmap them when destroying the pool

Yunsheng Lin <linyunsheng@...wei.com> writes:
> On 2025/3/11 21:26, Toke Høiland-Jørgensen wrote:
>> Yunsheng Lin <linyunsheng@...wei.com> writes:
>>
>>> On 2025/3/10 23:24, Toke Høiland-Jørgensen wrote:
>>>
>>>>>
>>>>> I guess that is one of the disadvantages of using an advanced struct
>>>>> like xarray :(
>>>>
>>>> Sure, there will be some overhead from using xarray, but I think the
>>>> simplicity makes up for it; especially since we can limit this to the
>>>
>>> As I understand it, it is more complicated; it is just that the
>>> complexity is now hidden behind xarray.
>>
>> Yes, which encapsulates the complexity into a shared abstraction that is
>> widely used in the kernel, so it does not add new complexity to the
>> kernel as a whole. Whereas your series adds a whole bunch of new
>> complexity to the kernel in the form of a new slab allocator.
>>
>>> Even if there is no space in 'struct page' to store the id, the
>>> 'struct page' pointer itself can be used as the id, if xarray can use a
>>> pointer as an id. But that might mean memory utilization is not as
>>> efficient as it could be, and performance would suffer too if more
>>> memory has to be allocated and freed.
>>
>> I don't think it can. But sure, there can be other ways around this.
>>
>> FWIW, I don't think your idea of allocating page_pool_items to use as an
>> indirection is totally crazy, but all the complexity around it (the
>> custom slab allocator, etc.) is way too much. And if we can avoid the
>> item indirection, that is obviously better.
>>
>>> It seems it is just a matter of choice between using a tailor-made,
>>> page_pool-specific optimization and using a generic advanced struct
>>> like xarray.
>>
>> Yup, basically.
>>
>>> I chose the tailor-made one because it ensures as little overhead as
>>> possible from a performance and memory utilization perspective; for
>>> example, the 'single producer, multiple consumer' guarantee offered by
>>> the NAPI context can avoid some locking and atomic operations.
>>
>> Right, and my main point is that the complexity of this is not worth it :)
>
> Without that complexity, there is about 400ns of performance overhead
> for xarray compared to about 10ns for the tailor-made one, which means
> more than 200% performance degradation for the page_pool03_slow test
> case:

Great, thanks for testing! I will include this in the non-RFC posting of
this series :)
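
For reference, the xarray-based tracking being measured above (and the
pointer-as-index variant mentioned further up) would amount to roughly
the following. This is only a sketch with made-up names
(pool->dma_mapped, pp_track_mapped_page(), etc.), not the actual patch:

#include <linux/xarray.h>

/* pool->dma_mapped would be set up at pool creation with
 * xa_init_flags(&pool->dma_mapped, XA_FLAGS_ALLOC);
 */
static int pp_track_mapped_page(struct page_pool *pool, struct page *page)
{
	u32 id;
	int err;

	/* Allocate a small id for this page; the id then has to be stashed
	 * somewhere in struct page so the entry can be found again on unmap.
	 */
	err = xa_alloc(&pool->dma_mapped, &id, page,
		       XA_LIMIT(1, UINT_MAX), GFP_ATOMIC);
	if (err)
		return err;

	/* page->pp_dma_id = id;  (wherever that id ends up living) */
	return 0;
}

/* The pointer-as-index variant mentioned further up would instead be
 * something like this; the indices are then sparse, which is where the
 * memory utilization concern comes from:
 */
static int pp_track_mapped_page_ptr(struct page_pool *pool, struct page *page)
{
	return xa_err(xa_store(&pool->dma_mapped, (unsigned long)page,
			       page, GFP_ATOMIC));
}
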
>>>> cases where it's absolutely needed.
>>>
>>> The above can also be done when using page_pool_item, as the lower 2
>>> bits can be used to indicate whether the pointer in 'struct page' is a
>>> 'page_pool_item' or a 'page_pool'; I just don't think it is necessary
>>> yet, as it might add more checking in the fast path.
>>
>> Yup, did think about using the lower bits to distinguish if it does turn
>> out that we can't avoid an indirection. See above; it's not actually the
>
> The 'memdesc' seems like an indirection to me when it is used to shrink
> 'struct page' to a smaller size.

Yes, it does seem like we'll end up with an indirection of some kind
eventually. But let's cross that bridge when we get to it...
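
For completeness, the lower-two-bits idea discussed above could look
roughly like the below. Again, this is just an illustrative sketch with
made-up names; it only relies on the pointers being at least 4-byte
aligned so that the two low bits are free to use as a type tag:

#define PP_PTR_TYPE_POOL	0x0UL	/* value points to a page_pool */
#define PP_PTR_TYPE_ITEM	0x1UL	/* value points to a page_pool_item */
#define PP_PTR_TYPE_MASK	0x3UL

static inline unsigned long pp_encode_ptr(void *ptr, unsigned long type)
{
	return (unsigned long)ptr | type;
}

static inline bool pp_ptr_is_item(unsigned long val)
{
	return (val & PP_PTR_TYPE_MASK) == PP_PTR_TYPE_ITEM;
}

static inline void *pp_decode_ptr(unsigned long val)
{
	return (void *)(val & ~PP_PTR_TYPE_MASK);
}

The extra mask-and-compare on every decode is the "more checking in the
fast path" concern mentioned above.
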
-Toke