Message-ID: <b071b158-6569-4bda-8886-6058897a7e28@gmail.com>
Date: Wed, 2 Oct 2024 09:52:51 +0800
From: Yunsheng Lin <yunshenglin0825@...il.com>
To: Paolo Abeni <pabeni@...hat.com>, Yunsheng Lin <linyunsheng@...wei.com>,
 davem@...emloft.net, kuba@...nel.org
Cc: liuyonglong@...wei.com, fanghaiqing@...wei.com, zhangkun09@...wei.com,
 Alexander Lobakin <aleksander.lobakin@...el.com>,
 Jesper Dangaard Brouer <hawk@...nel.org>,
 Ilias Apalodimas <ilias.apalodimas@...aro.org>,
 Eric Dumazet <edumazet@...gle.com>, netdev@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH net v2 1/2] page_pool: fix timing for checking and
 disabling napi_local

On 10/1/2024 7:30 PM, Paolo Abeni wrote:

...

>> @@ -828,6 +837,9 @@ void page_pool_put_unrefed_netmem(struct page_pool *pool, netmem_ref netmem,
>>           recycle_stat_inc(pool, ring_full);
>>           page_pool_return_page(pool, netmem);
>>       }
>> +
>> +    if (!allow_direct_orig)
>> +        rcu_read_unlock();
> 
> What about always acquiring the rcu lock? Would that impact performance
> negatively?
> 
> If not, I think it's preferable, as it would make the static checker happy.

As mentioned in the cover letter, the overhead is about 2ns.
I guess it is the 'if' check before rcu_read_unlock() that the static
checker is not happy about; there is also an 'if' check before the
'destroy_lock' introduced in patch 2. Maybe '__cond_acquires()' can be
used to make the static checker happy?
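
Something like the existing annotation on refcount_dec_and_lock() in
include/linux/refcount.h is what I have in mind for the conditional
acquire side:

	extern __must_check bool refcount_dec_and_lock(refcount_t *r,
						       spinlock_t *lock)
			__cond_acquires(lock);

Not sure there is a matching annotation for the conditional release
side yet, though.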

> 
>>   }
>>   EXPORT_SYMBOL(page_pool_put_unrefed_netmem);
> 
> [...]
> 
>> @@ -1121,6 +1140,12 @@ void page_pool_destroy(struct page_pool *pool)
>>           return;
>>       page_pool_disable_direct_recycling(pool);
>> +
>> +    /* Wait for the freeing side to see that direct recycling is
>> +     * disabled to avoid concurrent access to the pool->alloc cache.
>> +     */
>> +    synchronize_rcu();
> 
> When turning on/off a device with a lot of queues, the above could
> introduce a lot of long waits under the RTNL lock, right?
> 
> What about moving the tail of this function into a separate helper and
> using call_rcu() instead?

For this patch, yes, that can be done.
But patch 2 also relies on the rcu lock in this patch to ensure that
the free side is synchronized with the destroy path, and the dma
unmapping done for the inflight pages in page_pool_item_uninit()
cannot be done in call_rcu(), as the driver might have been unbound
by the time the RCU callback is called, which might defeat the
purpose of patch 2.
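
Just to illustrate why, a rough sketch of the call_rcu() variant
(assuming an rcu_head 'destroy_rcu' was added to struct page_pool for
this, which the current code does not have):

	static void page_pool_destroy_rcu(struct rcu_head *rcu)
	{
		struct page_pool *pool = container_of(rcu, struct page_pool,
						      destroy_rcu);

		/* By the time this callback runs the driver may already
		 * have unbound, so it is too late to dma unmap the
		 * inflight pages here.
		 */
		page_pool_item_uninit(pool);
	}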

Maybe an optimization here is to call synchronize_rcu() only when
there are inflight pages and pool->dma_map is set.
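
Something like below (untested; page_pool_inflight() being the helper
in net/core/page_pool.c):

	/* Only wait for readers when the free side may still touch the
	 * pool, i.e. there are inflight pages whose dma mapping has to
	 * be torn down by the destroy path.
	 */
	if (pool->dma_map && page_pool_inflight(pool, false) > 0)
		synchronize_rcu();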

