Message-ID: <f8bbfe7e-9935-4f4d-a9e8-b3547ed58112@huawei.com>
Date: Mon, 7 Apr 2025 19:26:33 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Toke Høiland-Jørgensen <toke@...hat.com>, Alexander
 Lobakin <aleksander.lobakin@...el.com>
CC: "David S. Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
	Jesper Dangaard Brouer <hawk@...nel.org>, Saeed Mahameed <saeedm@...dia.com>,
	Leon Romanovsky <leon@...nel.org>, Tariq Toukan <tariqt@...dia.com>, Andrew
 Lunn <andrew+netdev@...n.ch>, Eric Dumazet <edumazet@...gle.com>, Paolo Abeni
	<pabeni@...hat.com>, Ilias Apalodimas <ilias.apalodimas@...aro.org>, Simon
 Horman <horms@...nel.org>, Andrew Morton <akpm@...ux-foundation.org>, Mina
 Almasry <almasrymina@...gle.com>, Yonglong Liu <liuyonglong@...wei.com>,
	Pavel Begunkov <asml.silence@...il.com>, Matthew Wilcox
	<willy@...radead.org>, <netdev@...r.kernel.org>, <bpf@...r.kernel.org>,
	<linux-rdma@...r.kernel.org>, <linux-mm@...ck.org>, Qiuling Ren
	<qren@...hat.com>, Yuying Ma <yuma@...hat.com>
Subject: Re: [PATCH net-next v7 2/2] page_pool: Track DMA-mapped pages and
 unmap them when destroying the pool

On 2025/4/5 20:50, Toke Høiland-Jørgensen wrote:
> Alexander Lobakin <aleksander.lobakin@...el.com> writes:
> 
>> From: Alexander Lobakin <aleksander.lobakin@...el.com>
>> Date: Fri, 4 Apr 2025 17:55:43 +0200
>>
>>> From: Toke Høiland-Jørgensen <toke@...hat.com>
>>> Date: Fri, 04 Apr 2025 12:18:36 +0200
>>>
>>>> When enabling DMA mapping in page_pool, pages are kept DMA mapped until
>>>> they are released from the pool, to avoid the overhead of re-mapping the
>>>> pages every time they are used. This causes resource leaks and/or
>>>> crashes when there are pages still outstanding while the device is torn
>>>> down, because page_pool will attempt an unmap through a non-existent DMA
>>>> device on the subsequent page return.
>>>
>>> [...]
>>>
>>>> -#define PP_MAGIC_MASK ~0x3UL
>>>> +#define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
>>>>  
>>>>  /**
>>>>   * struct page_pool_params - page pool parameters
>>>> @@ -173,10 +212,10 @@ struct page_pool {
>>>>  	int cpuid;
>>>>  	u32 pages_state_hold_cnt;
>>>>  
>>>> -	bool has_init_callback:1;	/* slow::init_callback is set */
>>>> +	bool dma_sync;			/* Perform DMA sync for device */
>>>
>>> Yunsheng said this change to a full bool is redundant in the v6 thread
>>> ¯\_(ツ)_/¯
> 
> AFAIU, the comment was that the second READ_ONCE() when reading the
> field was redundant, because of the rcu_read_lock(). Which may be the
> case, but I think keeping it makes the intent of the code clearer. And
> in any case, it has nothing to do with changing the type of the field...

For the part about changing the type of the field, there are only two possible
outcomes when using a bit field here:
1. The read returns a correct value.
2. The read returns an incorrect value.

So the question seems to be: what could possibly go wrong if the read returns
an incorrect value, given that there is an additional read under the RCU read
lock and an RCU sync after clearing pool->dma_sync, considering we only need
to ensure that no DMA sync API is called after the RCU sync?
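
Something like the below ordering in the destroy path is what I have in mind
(only a rough sketch to show the point, not the actual code; the last step is
just a placeholder for whatever the destroy path really does):

	/* destroy path */
	pool->dma_sync = false;
	synchronize_rcu();	/* wait for readers under rcu_read_lock() */
	/*
	 * From here on no reader can still see dma_sync == true, so no
	 * dma sync API can be called for this pool any more.
	 */
	... unmap the remaining inflight pages ...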

And it seems data_race() can be used to mark the above read so that KCSAN
will not complain.
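
For example, something like the below in the reading path (only a sketch;
data_race() is from include/linux/compiler.h, and the function/argument names
are just reused from the existing page_pool code for illustration):

	rcu_read_lock();
	/*
	 * Lockless read annotated with data_race() so that KCSAN does not
	 * report it; correctness is ensured by the synchronize_rcu() after
	 * clearing pool->dma_sync in the destroy path.
	 */
	if (data_race(pool->dma_sync))
		__page_pool_dma_sync_for_device(pool, netmem, dma_sync_size);
	rcu_read_unlock();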

IOW, changing the type of the field isn't really necessary, as I understand it.
