Message-ID: <e67ea4ee50b7a5e4774d3e91a1bfb4d14bfa308e.camel@redhat.com>
Date: Tue, 23 Apr 2024 09:56:33 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: David J Wilder <dwilder@...ibm.com>, netdev@...r.kernel.org
Subject: Re: [RFC PATCH] net: skb: Increasing allocation in __napi_alloc_skb() to 2k when needed.
On Fri, 2024-04-19 at 15:23 -0700, David J Wilder wrote:
> When testing CONFIG_MAX_SKB_FRAGS=45 on ppc64le and x86_64 I ran into a
> couple of issues.
>
> __napi_alloc_skb() assumes its smallest fragment allocations will fit in
> 1K. When CONFIG_MAX_SKB_FRAGS is increased this may no longer be true
> resulting in __napi_alloc_skb() reverting to using page_frag_alloc().
> This results in the return of the bug fixed in:
> Commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for
> tiny skbs")
>
> That commit ensured that "small skb head fragments are kmalloc backed,
> so that other objects in the slab page can be reused instead of being held
> as long as skbs are sitting in socket queues."
>
> On ppc64le the warning from napi_get_frags_check() is displayed when
> CONFIG_MAX_SKB_FRAGS is set to 45. The purpose of the warning is to detect
> when an increase of MAX_SKB_FRAGS has reintroduced the aforementioned bug.
> Unfortunately on x86_64 this warning is not seen, even though it should be.
> I found the warning was disabled by:
> commit dbae2b062824 ("net: skb: introduce and use a single page frag
> cache")
>
> This RFC patch to __napi_alloc_skb() determines if an skbuff allocation
> with a head fragment of size GRO_MAX_HEAD will fit in a 1k allocation,
> increasing the allocation to 2k if needed.
>
> I have functionally tested this patch, performance testing is still needed.
>
> TBD: Remove the limitation on 4k page size from the single page frag cache
> allowing ppc64le (64K page size) to benefit from this change.
>
> TBD: I have not addressed the warning in napi_get_frags_check() on x86_64.
> Will the warning still be needed once the other changes are completed?
Thanks for the detailed analysis.
As mentioned by Eric in commit
bf9f1baa279f0758dc2297080360c5a616843927, it should now be possible to
revert dbae2b062824 without incurring performance regressions for the
relevant use case. That revert has been on my todo list for a long
time, but I was unable to allocate time for it.
I think such a revert would be preferable. Would you be able to
evaluate that option?
Thanks!
Paolo