Date:   Wed, 12 Apr 2023 09:56:29 +0300
From:   Tariq Toukan <ttoukan.linux@...il.com>
To:     Jakub Kicinski <kuba@...nel.org>, davem@...emloft.net
Cc:     netdev@...r.kernel.org, edumazet@...gle.com, pabeni@...hat.com,
        hawk@...nel.org, ilias.apalodimas@...aro.org,
        linyunsheng@...wei.com, Dragos Tatulea <dtatulea@...dia.com>,
        Gal Pressman <gal@...dia.com>,
        Saeed Mahameed <saeedm@...dia.com>,
        Tariq Toukan <tariqt@...dia.com>
Subject: Re: [PATCH net-next 0/3] page_pool: allow caching from safely
 localized NAPI



On 12/04/2023 9:41, Tariq Toukan wrote:
> Hi,
> 
> Happy holidays! I was traveling and couldn't review the RFCs.
> 
> Dragos and I had this task planned after our recent page pool work 
> in mlx5e. We're so glad to see this patchset! :)
> 
> On 11/04/2023 23:17, Jakub Kicinski wrote:
>> I went back to the explicit "are we in NAPI" method, mostly
>> because I don't like having both around :( (even tho I maintain
>> that in_softirq() && !in_hardirq() is as safe, as softirqs do
>> not nest).
>>
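(For anyone skimming the series: a minimal sketch of what the explicit
check boils down to. Field and helper names below are approximate, not
quoted from the patches; the idea is that the direct cache is trusted
only when the skb is freed by the NAPI that owns the pool, on the CPU
currently polling it.)

/* Sketch: the skb freeing paths plumb down a napi_safe flag, and
 * the pool uses the lockless direct cache only if its NAPI is
 * being polled on this very CPU right now.
 */
static bool pp_napi_owned(const struct page_pool *pool, bool napi_safe)
{
        const struct napi_struct *napi = READ_ONCE(pool->p.napi);

        return napi_safe && napi &&
               READ_ONCE(napi->list_owner) == smp_processor_id();
}
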
>> Still returning the skbs to a CPU, tho, not to the NAPI instance.
>> I reckon we could create a small refcounted struct per NAPI instance
>> which would allow sockets and other users to hold a persistent
>> and safe reference. But that's a bigger change, and I get 90+%
>> recycling thru the cache with just these patches (for RR and
>> streaming tests with 100% CPU use it's almost 100%).
>>
>> Some numbers for a streaming test with 100% CPU use (from the previous
>> version, but really they perform the same):
>>
>>                                   HW-GRO              page=page
>>                           before      after     before      after
>> recycle:
>>   cached:                      0  138669686          0  150197505
>>   cache_full:                  0     223391          0      74582
>>   ring:                138551933    9997191  149299454          0
>>   ring_full:                   0        488       3154     127590
>>   released_refcnt:             0          0          0          0
>>
> 
> Impressive.
> 
> Dragos tested your RFC v1.
> He can test this one as well; we expect the same effect.
> 
>> alloc:
>>   fast:                136491361  148615710  146969587  150322859
>>   slow:                     1772       1799        144        105
>>   slow_high_order:             0          0          0          0
>>   empty:                    1772       1799        144        105
>>   refill:                2165245     156302    2332880       2128
>>   waive:                       0          0          0          0
>>
> 
> General note:
> For fragmented page-pool pages, the decision whether to go through the 
> cache or the ring depends only on the last releasing thread and its 
> context.
> 
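To make that concrete, a rough sketch of the last-frag decision using
the existing page pool fragment helpers (the wrapper and its call site
are illustrative, not actual driver code):

/* Sketch: frags of one pp page may be freed from different
 * contexts; only the thread that drops the last fragment
 * reference actually returns the page, so its context alone
 * decides direct cache vs. ptr_ring for the whole page.
 */
static void pp_frag_release(struct page_pool *pool, struct page *page,
                            bool allow_direct)
{
        if (page_pool_defrag_page(page, 1))
                return;         /* not the last frag: nothing released */
        page_pool_put_defragged_page(pool, page, -1, allow_direct);
}
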
> Our in-driver deferred-release trick is in many cases beneficial even in 
> the presence of this idea. The sets of cases improved by each approach 
> intersect, but are not identical.
> That's why we decided to keep both solutions working together, rather 
> than only one.
> 
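For readers unfamiliar with that trick, a hypothetical sketch (made-up
struct and helper names, not the actual mlx5e code): the driver stashes
pages instead of freeing them wherever the last use happens, and flushes
the stash from its NAPI poll, so releases always run in a context where
direct-cache recycling is legal:

/* Hypothetical driver-side deferral, not the mlx5e implementation. */
static void my_rq_defer_release(struct my_rq *rq, struct page *page)
{
        /* stash instead of releasing in an arbitrary context */
        rq->deferred[rq->n_deferred++] = page;
}

static void my_rq_flush_deferred(struct my_rq *rq)
{
        int i;

        /* called from the NAPI poll loop, where recycling straight
         * into the pool's lockless cache is safe
         */
        for (i = 0; i < rq->n_deferred; i++)
                page_pool_put_full_page(rq->pool, rq->deferred[i], true);
        rq->n_deferred = 0;
}
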
>> v1:
>>   - rename the arg in_normal_napi -> napi_safe
>>   - also allow recycling in __kfree_skb_defer()
>> rfcv2: 
>> https://lore.kernel.org/all/20230405232100.103392-1-kuba@kernel.org/
>>
>> Jakub Kicinski (3):
>>    net: skb: plumb napi state thru skb freeing paths
>>    page_pool: allow caching from safely localized NAPI
>>    bnxt: hook NAPIs to page pools
>>
>>   Documentation/networking/page_pool.rst    |  1 +
>>   drivers/net/ethernet/broadcom/bnxt/bnxt.c |  1 +
>>   include/linux/netdevice.h                 |  3 ++
>>   include/linux/skbuff.h                    | 20 +++++++----
>>   include/net/page_pool.h                   |  3 +-
>>   net/core/dev.c                            |  3 ++
>>   net/core/page_pool.c                      | 15 ++++++--
>>   net/core/skbuff.c                         | 42 ++++++++++++-----------
>>   8 files changed, 58 insertions(+), 30 deletions(-)
>>
> 

Reviewed-by: Tariq Toukan <tariqt@...dia.com>

We'll push a patch enabling it in mlx5e as soon as this series is accepted.
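
For other driver authors, wiring this up should essentially be one
assignment at pool-creation time, along the lines of the bnxt patch in
this series (the driver-side names below are illustrative):

struct page_pool_params pp_params = {
        .order          = 0,
        .pool_size      = ring_size,
        .nid            = numa_node,
        .dev            = dma_dev,
        /* tie the pool to the NAPI that polls this ring, so skbs
         * freed from that NAPI can use the direct cache
         */
        .napi           = &my_napi,
};
struct page_pool *pool = page_pool_create(&pp_params);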
