lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <871pu4xkcz.fsf@toke.dk>
Date: Mon, 07 Apr 2025 13:49:00 +0200
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Yunsheng Lin <linyunsheng@...wei.com>, Alexander Lobakin
 <aleksander.lobakin@...el.com>
Cc: "David S. Miller" <davem@...emloft.net>, Jakub Kicinski
 <kuba@...nel.org>, Jesper Dangaard Brouer <hawk@...nel.org>, Saeed
 Mahameed <saeedm@...dia.com>, Leon Romanovsky <leon@...nel.org>, Tariq
 Toukan <tariqt@...dia.com>, Andrew Lunn <andrew+netdev@...n.ch>, Eric
 Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>, Ilias
 Apalodimas <ilias.apalodimas@...aro.org>, Simon Horman <horms@...nel.org>,
 Andrew Morton <akpm@...ux-foundation.org>, Mina Almasry
 <almasrymina@...gle.com>, Yonglong Liu <liuyonglong@...wei.com>, Pavel
 Begunkov <asml.silence@...il.com>, Matthew Wilcox <willy@...radead.org>,
 netdev@...r.kernel.org, bpf@...r.kernel.org, linux-rdma@...r.kernel.org,
 linux-mm@...ck.org, Qiuling Ren <qren@...hat.com>, Yuying Ma
 <yuma@...hat.com>
Subject: Re: [PATCH net-next v7 2/2] page_pool: Track DMA-mapped pages and
 unmap them when destroying the pool

Yunsheng Lin <linyunsheng@...wei.com> writes:

> On 2025/4/5 20:50, Toke Høiland-Jørgensen wrote:
>> Alexander Lobakin <aleksander.lobakin@...el.com> writes:
>> 
>>> From: Alexander Lobakin <aleksander.lobakin@...el.com>
>>> Date: Fri, 4 Apr 2025 17:55:43 +0200
>>>
>>>> From: Toke Høiland-Jørgensen <toke@...hat.com>
>>>> Date: Fri, 04 Apr 2025 12:18:36 +0200
>>>>
>>>>> When enabling DMA mapping in page_pool, pages are kept DMA mapped until
>>>>> they are released from the pool, to avoid the overhead of re-mapping the
>>>>> pages every time they are used. This causes resource leaks and/or
>>>>> crashes when there are pages still outstanding while the device is torn
>>>>> down, because page_pool will attempt an unmap through a non-existent DMA
>>>>> device on the subsequent page return.
>>>>
>>>> [...]
>>>>
>>>>> -#define PP_MAGIC_MASK ~0x3UL
>>>>> +#define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
>>>>>  
>>>>>  /**
>>>>>   * struct page_pool_params - page pool parameters
>>>>> @@ -173,10 +212,10 @@ struct page_pool {
>>>>>  	int cpuid;
>>>>>  	u32 pages_state_hold_cnt;
>>>>>  
>>>>> -	bool has_init_callback:1;	/* slow::init_callback is set */
>>>>> +	bool dma_sync;			/* Perform DMA sync for device */
>>>>
>>>> Yunsheng said this change to a full bool is redundant in the v6 thread
>>>> ¯\_(ツ)_/¯
>> 
>> AFAIU, the comment was that the second READ_ONCE() when reading the
>> field was redundant, because of the rcu_read_lock(). Which may be the
>> case, but I think keeping it makes the intent of the code clearer. And
>> in any case, it has nothing to do with changing the type of the field...
>
> For changing the type of the field part, there are only two outcomes here
> when using bit field here:
> 1. The reading returns a correct value.
> 2. The reading returns a incorrect value.
>
> So the question seems to be what would possibly go wrong when the reading
> return an incorrect value when there is an additional reading under the rcu
> read lock and there is a rcu sync after clearing pool->dma_sync? Considering
> we only need to ensure there is no dma sync API called after rcu sync.

Okay, so your argument is basically that the barrier in rcu_read_lock()
should prevent the compiler from coalescing the two reads of the
pp->dma_sync field in page_pool_dma_sync_for_device()? And that
READ/WRITE_ONCE() are not needed for the same reason?

> And it seems data_race() can be used to mark the above reading so that KCSAN
> will not complain.

Where would you suggest to add those? Not sure such annotations would
improve readability relative to the current use of READ/WRITE_ONCE()?
The latter is more clear in communicating intent, I would say...

> IOW, changing the type of the field part isn't that necessary as my
> understanding.

Since changing the field doesn't change the size of the structure, I
would be inclined to keep the change for readability reasons, cf the
above.

-Toke


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ