Message-ID: <64eeeee0-8ec8-458b-b01a-ac1dd6c96583@huawei.com>
Date: Fri, 27 Sep 2024 11:58:47 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Joe Damato <jdamato@...tly.com>, <davem@...emloft.net>, <kuba@...nel.org>,
	<pabeni@...hat.com>, <liuyonglong@...wei.com>, <fanghaiqing@...wei.com>,
	<zhangkun09@...wei.com>, Alexander Lobakin <aleksander.lobakin@...el.com>,
	Jesper Dangaard Brouer <hawk@...nel.org>, Ilias Apalodimas
	<ilias.apalodimas@...aro.org>, Eric Dumazet <edumazet@...gle.com>,
	<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<mkarsten@...terloo.ca>
Subject: Re: [PATCH net v2 1/2] page_pool: fix timing for checking and
 disabling napi_local

On 2024/9/27 4:06, Joe Damato wrote:
> On Wed, Sep 25, 2024 at 03:57:06PM +0800, Yunsheng Lin wrote:
>> A page_pool page may be freed from skb_defer_free_flush() in
>> softirq context, which can cause a concurrent access problem
>> for the pool->alloc cache due to the time window shown below:
>> both CPU0 and CPU1 may access the pool->alloc cache
>> concurrently in page_pool_empty_alloc_cache_once() and
>> page_pool_recycle_in_cache():
>>
>>           CPU 0                           CPU1
>>     page_pool_destroy()          skb_defer_free_flush()
>>            .                               .
>>            .                   page_pool_put_unrefed_page()
>>            .                               .
>>            .               allow_direct = page_pool_napi_local()
>>            .                               .
>> page_pool_disable_direct_recycling()       .
>>            .                               .
>> page_pool_empty_alloc_cache_once() page_pool_recycle_in_cache()
>>
>> Use the RCU mechanism to avoid the above concurrent access problem.
>>
>> Note: the above was found during code review while working out how
>> to fix the problem in [1].
>>
>> 1. https://lore.kernel.org/lkml/8067f204-1380-4d37-8ffd-007fc6f26738@kernel.org/T/
>>
>> Fixes: dd64b232deb8 ("page_pool: unlink from napi during destroy")
>> Signed-off-by: Yunsheng Lin <linyunsheng@...wei.com>
>> CC: Alexander Lobakin <aleksander.lobakin@...el.com>
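
For readers following along: the fix boils down to making the reader
side an RCU read-side critical section and inserting a grace period in
the destroy path. A minimal sketch of the idea (field and helper names
approximate the page_pool code; this is not a quote of the actual
patch):

	/* softirq / reader side: decide if direct recycling is safe */
	rcu_read_lock();
	napi = READ_ONCE(pool->p.napi);
	allow_direct = napi &&
		       READ_ONCE(napi->list_owner) == smp_processor_id();
	/* ... page_pool_recycle_in_cache() may touch pool->alloc ... */
	rcu_read_unlock();

	/* destroy side: disable direct recycling, then wait out readers */
	page_pool_disable_direct_recycling(pool);  /* clear pool->p.napi */
	synchronize_rcu();                 /* readers of old napi are done */
	page_pool_empty_alloc_cache_once(pool);  /* no concurrent writer now */

Since skb_defer_free_flush() runs in softirq context, which counts as
an RCU read-side critical section, synchronize_rcu() guarantees any
in-flight page_pool_napi_local() check has finished before the alloc
cache is emptied.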
> 
> Sorry for the noise, but I hit an assert in page_pool_unref_netmem
> and I am trying to figure out whether it is related to what you all
> are debugging. I thought it might be, but if not, my apologies.
> 
> Just in case it is, I've put the backtrace on github [1]. I
> triggered this while testing an RFC [2] I've been working on. Please
> note, the RFC posted publicly does not currently apply cleanly to
> net-next and has some bugs I've fixed in my v4. I had planned to
> send the v4 early next week and mention the page pool issue I am
> hitting.
> 
> After triggering the assert in [1], I tried applying the patches of
> this series and retesting the RFC v4 I have queued locally. When I
> did that, I hit a new assertion in page_pool_destroy [3].
> 
> There are a few possibilities:
>    1. I am hitting the same issue you are hitting
>    2. I am hitting a different issue caused by a bug I introduced
>    3. I am hitting a different page pool issue entirely
> 
> In case of 2 and 3, my apologies for the noise.
> 
> In case of 1: If you think I am hitting the same issue as you are
> trying to solve, I can reliably reproduce this with my RFC v4 and
> would be happy to test any patches meant to fix the issue.
> 
> [1]: https://gist.githubusercontent.com/jdamato-fsly/eb628c8bf4e4d1c8158441644cdb7e52/raw/96dcf422303d9e64b5060f2fb0f1d71e04ab048e/warning1.txt

It seems there may be some concurrent access to page->pp_ref_count/
page_pool->frag_users, or imbalanced incrementing and decrementing of
page->pp_ref_count.
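
To illustrate what "imbalanced" means here, below is a hypothetical
driver bug of the kind that would trip the assert in
page_pool_unref_netmem() (not a claim about your RFC, just the pattern
to look for):

	struct page *page = page_pool_dev_alloc_pages(pool);

	/* ... hand the page to the stack, which returns it once ... */
	page_pool_put_full_page(pool, page, false);

	/* a second put of the same page drops pp_ref_count below what
	 * the pool expects and fires the underflow check */
	page_pool_put_full_page(pool, page, false);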

For the concurrent access part, you may refer to the below patch
for debugging; but as far as I recall, the mlx5 driver doesn't use
the frag API directly, so you can't use that patch as-is.

https://lore.kernel.org/all/20240903122208.3379182-1-linyunsheng@huawei.com/
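
(By "frag API" I mean allocations like the below, where multiple
sub-page fragments share one page and its pp_ref_count; frag_size is
just an illustrative variable, this is not mlx5 code:)

	unsigned int offset;
	struct page *page;

	/* each successful call hands out a fragment of a shared page */
	page = page_pool_alloc_frag(pool, &offset, frag_size, GFP_ATOMIC);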

> [2]: https://lore.kernel.org/all/20240912100738.16567-1-jdamato@fastly.com/#r
> [3]: https://gist.githubusercontent.com/jdamato-fsly/eb628c8bf4e4d1c8158441644cdb7e52/raw/96dcf422303d9e64b5060f2fb0f1d71e04ab048e/warning2.txt

That warning is only added to see if there are any in-flight pages
that need DMA unmapping when page_pool_destroy() is called; you can
ignore the warning or remove the WARN_ONCE() in page_pool_item_uninit()
when testing.
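
For reference, the check in question is shaped roughly like this (a
sketch against the in-flight accounting the pool already keeps; the
actual patch may differ):

	/* pages handed out but never returned still hold a DMA
	 * mapping that can't be undone once the pool is gone */
	s32 inflight = pool->pages_state_hold_cnt -
		       atomic_read(&pool->pages_state_release_cnt);

	WARN_ONCE(pool->dma_map && inflight,
		  "%d inflight pages still need dma unmapping\n", inflight);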

