Message-ID: <f14cb268-efd2-1b62-22cb-d501f1f183a7@huawei.com>
Date: Sat, 9 May 2020 08:58:00 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Andrew Lunn <andrew@...n.ch>,
Sunil Kovvuri <sunil.kovvuri@...il.com>
CC: Kevin Hao <haokexin@...il.com>,
Linux Netdev List <netdev@...r.kernel.org>,
Sunil Goutham <sgoutham@...vell.com>,
"Geetha sowjanya" <gakula@...vell.com>,
Subbaraya Sundeep <sbhatta@...vell.com>,
hariprasad <hkelam@...vell.com>,
"David S. Miller" <davem@...emloft.net>
Subject: Re: [PATCH] octeontx2-pf: Use the napi_alloc_frag() to alloc the pool buffers

On 2020/5/8 21:01, Andrew Lunn wrote:
> On Fri, May 08, 2020 at 01:08:13PM +0530, Sunil Kovvuri wrote:
>> On Fri, May 8, 2020 at 11:00 AM Kevin Hao <haokexin@...il.com> wrote:
>>>
>>> On Fri, May 08, 2020 at 10:18:27AM +0530, Sunil Kovvuri wrote:
>>>> On Fri, May 8, 2020 at 9:43 AM Kevin Hao <haokexin@...il.com> wrote:
>>>>>
>>>>> In the current code, octeontx2 uses its own method to allocate
>>>>> the pool buffers, but there are some issues in this implementation.
>>>>> 1. We have to run otx2_get_page() for each allocation cycle and
>>>>> this is pretty error prone. As far as I can see, there is no
>>>>> invocation of otx2_get_page() in otx2_pool_refill_task(), which
>>>>> leaves the allocated pages with a wrong refcount, so they may be
>>>>> freed incorrectly.
>>>>
>>>> Thanks for pointing that out, will fix.
>>>>
>>>>> 2. It wastes memory. For example, if we only receive one packet in a
>>>>> NAPI RX cycle, we then allocate a 2K buffer with otx2_alloc_rbuf()
>>>>> to refill the pool buffers, leaving the remaining area of the
>>>>> allocated page wasted. On a kernel with 64K pages, a 62K area is
>>>>> wasted.
>>>>>
>>>>> IMHO it is really unnecessary to implement our own method for
>>>>> buffer allocation; we can reuse napi_alloc_frag() to simplify
>>>>> the code.
>>>>>
>>>>> Signed-off-by: Kevin Hao <haokexin@...il.com>
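
To make the napi_alloc_frag() suggestion above concrete, a refill helper
could look roughly like the sketch below. The buffer-size handling, the
0-on-failure convention and the dma_map_single() placement here are
illustrative assumptions, not the actual patch:

#include <linux/dma-mapping.h>
#include <linux/skbuff.h>

/* Minimal sketch, not the actual patch: refill one RX buffer with
 * napi_alloc_frag(), which carves the buffer out of a shared per-CPU
 * page, so the page refcount is normally untouched per call.
 */
static dma_addr_t example_alloc_rbuf(struct device *dev, unsigned int buf_len)
{
	dma_addr_t iova;
	void *buf;

	buf = napi_alloc_frag(buf_len);
	if (unlikely(!buf))
		return 0;	/* 0 is the failure marker in this sketch */

	iova = dma_map_single(dev, buf, buf_len, DMA_FROM_DEVICE);
	if (unlikely(dma_mapping_error(dev, iova))) {
		skb_free_frag(buf);
		return 0;
	}

	return iova;
}
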
>>>>
>>>> Have you measured performance with and without your patch?
>>>
>>> I will do a performance comparison later, but I don't think there
>>> will be a measurable difference.
>>>
>>>> I didn't use napi_alloc_frag() as it's too costly: if in one NAPI
>>>> instance the driver receives 32 packets, then 32 calls to
>>>> napi_alloc_frag() and per-fragment updates to the page refcount etc.
>>>> are costly.
>>>
>>> No, the page ref is only updated at page allocation and once all the
>>> space has been used. In general, an invocation of napi_alloc_frag()
>>> will not cause an update of the page ref. So in theory, the number of
>>> page-ref updates should be reduced by using napi_alloc_frag() compared
>>> to the current otx2 implementation.
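
For background, the core page_frag allocator batches the refcount work
roughly as in the simplified sketch below. The names and constants are
illustrative; the real code (in mm/page_alloc.c as of this thread)
additionally reuses a drained page and drops the leftover bias when a
page is retired, which is omitted here for brevity:

#include <linux/gfp.h>
#include <linux/mm.h>

#define FRAG_CACHE_SIZE	32768	/* order-3 cache, as the core uses */

struct frag_cache_sketch {
	struct page *page;
	unsigned int offset;
};

static void *frag_alloc_sketch(struct frag_cache_sketch *fc,
			       unsigned int fragsz)
{
	if (!fc->page || fc->offset < fragsz) {
		fc->page = alloc_pages(GFP_ATOMIC | __GFP_COMP,
				       get_order(FRAG_CACHE_SIZE));
		if (!fc->page)
			return NULL;
		/* One batched refcount update covers every fragment that
		 * will ever be carved out of this page; the per-fragment
		 * fast path below never touches the refcount.
		 */
		page_ref_add(fc->page, FRAG_CACHE_SIZE - 1);
		fc->offset = FRAG_CACHE_SIZE;
	}
	fc->offset -= fragsz;
	return page_address(fc->page) + fc->offset;
}
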
>>>
>>
>> Okay, it seems I misunderstood it.
>
> Hi Sunil
>
> In general, you should not work around issues in the core; you should
> improve the core. If your implementation really were more efficient
> than the core code, it would have been better to propose fixes to the
> core than to hide better code away in your own driver.
Hi Andrew,

Looking at the napi_alloc_frag() API, the DMA mapping/unmapping is done
by the caller. If the mapping/unmapping were managed in the core, it
could be avoided when a page is reused, and the mapping/unmapping
operation is costly when the IOMMU is on. Do you think it makes sense
to do the mapping/unmapping in the page_frag_*() helpers?
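
For example, something like the purely hypothetical sketch below, where
the core would own the mapping and keep it across page reuse; none of
these names exist in the kernel today:

/* Hypothetical: a page_frag variant where the core owns the DMA
 * mapping, so a page that is reused for new fragments keeps its
 * existing mapping and the driver never maps or unmaps per buffer.
 */
struct dma_page_frag_cache;	/* would wrap page_frag_cache + dma_addr_t */

void *dma_page_frag_alloc(struct device *dev, struct dma_page_frag_cache *nc,
			  unsigned int fragsz, gfp_t gfp, dma_addr_t *iova);
void dma_page_frag_free(struct device *dev, void *addr, unsigned int fragsz);
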
>
> Andrew