Message-ID: <22de6033-744e-486e-bbd9-8950249cd018@huawei.com>
Date: Tue, 14 Jan 2025 21:03:43 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Yunsheng Lin <yunshenglin0825@...il.com>,
	Toke Høiland-Jørgensen <toke@...hat.com>,
	<davem@...emloft.net>, <kuba@...nel.org>, <pabeni@...hat.com>
CC: <zhangkun09@...wei.com>, <liuyonglong@...wei.com>,
	<fanghaiqing@...wei.com>, Alexander Lobakin <aleksander.lobakin@...el.com>,
	Xuan Zhuo <xuanzhuo@...ux.alibaba.com>, Jesper Dangaard Brouer
	<hawk@...nel.org>, Ilias Apalodimas <ilias.apalodimas@...aro.org>, Eric
 Dumazet <edumazet@...gle.com>, Simon Horman <horms@...nel.org>,
	<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net-next v7 2/8] page_pool: fix timing for checking and
 disabling napi_local

On 2025/1/11 13:24, Yunsheng Lin wrote:

...

>>>   }
>>>     void page_pool_put_unrefed_netmem(struct page_pool *pool, netmem_ref netmem,
>>> @@ -1165,6 +1172,12 @@ void page_pool_destroy(struct page_pool *pool)
>>>       if (!page_pool_release(pool))
>>>           return;
>>> +    /* Paired with the RCU read lock in page_pool_napi_local() to
>>> +     * ensure the clearing of pool->p.napi in
>>> +     * page_pool_disable_direct_recycling() is seen before returning
>>> +     * to the driver, which may then free the napi instance.
>>> +     */
>>> +    synchronize_rcu();
>>
>> Most drivers call page_pool_destroy() in a loop for each RX queue, so
>> now you're introducing a full synchronize_rcu() wait for each queue.
>> That can delay tearing down the device significantly, so I don't think
>> this is a good idea.
> 
> synchronize_rcu() is called after page_pool_release(pool), which means
> it is only called when there are some inflight pages, so there is not
> necessarily a full synchronize_rcu() wait for each queue.
> 
> Anyway, it seems that some cases need an explicit synchronize_rcu()
> while other cases depend on another API providing synchronize_rcu()
> semantics. Maybe we should provide two different APIs for the two
> cases, like the netif_napi_del()/__netif_napi_del() APIs do?

As the synchronize_rcu() is also needed to fix the DMA API misuse problem,
we cannot really handle it the way the netif_napi_del()/__netif_napi_del()
APIs do. The best I can think of is something like the below:

bool need_sync = false;

/* Phase 1: prepare each pool; a true return means inflight pages
 * remain and an RCU grace period is needed before committing.
 */
for (each queue)
	need_sync |= page_pool_destroy_prepare(queue->pool);

/* One grace period covers all the queues */
if (need_sync)
	synchronize_rcu();

/* Phase 2: finish destroying each pool */
for (each queue)
	page_pool_destroy_commit(queue->pool);

But I am not sure whether the above is worth the effort for now, as the
synchronize_rcu() is only called for the inflight-page case.
Any better idea? If not, maybe we can optimize the above later if
the synchronize_rcu() does turn out to be a problem.
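
For concreteness, a driver-side teardown using such a two-phase API
might look like the below. This is only a sketch:
page_pool_destroy_prepare()/page_pool_destroy_commit() do not exist
yet, and the driver structures (my_drv_priv, rxq, num_rx_queues) are
made up for illustration:

static void my_drv_destroy_page_pools(struct my_drv_priv *priv)
{
	bool need_sync = false;
	int i;

	/* Phase 1: disable direct recycling and try to release each
	 * pool; remember whether any pool still has inflight pages.
	 */
	for (i = 0; i < priv->num_rx_queues; i++)
		need_sync |= page_pool_destroy_prepare(priv->rxq[i].pool);

	/* One synchronize_rcu() wait for the whole device instead of
	 * one per RX queue, addressing the teardown-delay concern.
	 */
	if (need_sync)
		synchronize_rcu();

	/* Phase 2: finish destroying each pool */
	for (i = 0; i < priv->num_rx_queues; i++)
		page_pool_destroy_commit(priv->rxq[i].pool);
}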

