netdev - Re: [PATCH net-next v2 2/3] page_pool: support non-frag page for page_pool_alloc

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <2e4f0359-151a-5cff-6d31-0ea0f014ef9a@huawei.com>
Date: Wed, 31 May 2023 20:19:07 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Alexander H Duyck <alexander.duyck@...il.com>, <davem@...emloft.net>,
	<kuba@...nel.org>, <pabeni@...hat.com>
CC: <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>, Lorenzo Bianconi
	<lorenzo@...nel.org>, Jesper Dangaard Brouer <hawk@...nel.org>, Ilias
 Apalodimas <ilias.apalodimas@...aro.org>, Eric Dumazet <edumazet@...gle.com>
Subject: Re: [PATCH net-next v2 2/3] page_pool: support non-frag page for
 page_pool_alloc_frag()

On 2023/5/30 23:07, Alexander H Duyck wrote:
...

>> +	if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT) {
>> +		*offset = 0;
>> +		return page_pool_alloc_pages(pool, gfp);
>> +	}
>> +
> 
> This is a recipe for pain. Rather than doing this I would say we should
> stick with our existing behavior and not allow page pool fragments to
> be used when the DMA address is consuming the region. Otherwise we are
> going to make things very confusing.

Are there any other concern other than confusing? we could add a
big comment to make it clear.

The point of adding that is to avoid the driver handling the
PAGE_POOL_DMA_USE_PP_FRAG_COUNT when using page_pool_alloc_frag()
like something like below:

if (!PAGE_POOL_DMA_USE_PP_FRAG_COUNT)
	page = page_pool_alloc_frag()
else
	page = XXXXX;

Or do you perfer the driver handling it? why?

> 
> If we have to have both version I would much rather just have some
> inline calls in the header wrapped in one #ifdef for
> PAGE_POOL_DMA_USE_PP_FRAG_COUNT that basically are a wrapper for
> page_pool pages treated as pp_frag.

Do you have a good name in mind for that wrapper.
In addition to the naming, which API should I use when I am a driver
author wanting to add page pool support?

> 
>>  	size = ALIGN(size, dma_get_cache_alignment());
>> -	*offset = pool->frag_offset;
>>  
> 
> If we are going to be allocating mono-frag pages they should be
> allocated here based on the size check. That way we aren't discrupting
> the performance for the smaller fragments and the code below could
> function undisturbed.

It is to allow possible optimization as below.

> 
>> -	if (page && *offset + size > max_size) {
>> +	if (page) {
>> +		*offset = pool->frag_offset;
>> +
>> +		if (*offset + size <= max_size) {
>> +			pool->frag_users++;
>> +			pool->frag_offset = *offset + size;
>> +			alloc_stat_inc(pool, fast);
>> +			return page;

Note that we still allow frag page here when '(size << 1 > max_size)'.

>> +		}
>> +
>> +		pool->frag_page = NULL;
>>  		page = page_pool_drain_frag(pool, page);
>>  		if (page) {
>>  			alloc_stat_inc(pool, fast);
>> @@ -714,26 +727,24 @@ struct page *page_pool_alloc_frag(struct page_pool *pool,
>>  		}
>>  	}
>>  
>> -	if (!page) {
>> -		page = page_pool_alloc_pages(pool, gfp);
>> -		if (unlikely(!page)) {
>> -			pool->frag_page = NULL;
>> -			return NULL;
>> -		}
>> -
>> -		pool->frag_page = page;
>> +	page = page_pool_alloc_pages(pool, gfp);
>> +	if (unlikely(!page))
>> +		return NULL;
>>  
>>  frag_reset:
>> -		pool->frag_users = 1;
>> +	/* return page as non-frag page if a page is not able to
>> +	 * hold two frags for the current requested size.
>> +	 */
> 
> This statement ins't exactly true since you make all page pool pages
> into fragmented pages.

Any suggestion to describe it more accurately?
I wrote that thinking frag_count being one as non-frag page.

> 
> 
>> +	if (unlikely(size << 1 > max_size)) {
> 
> This should happen much sooner so you aren't mixing these allocations
> with the smaller ones and forcing the fragmented page to be evicted.

As mentioned above, it is to allow a possible optimization