Message-ID: <b8eb926e-cfc9-b082-5bb9-719be3937c5d@kernel.org>
Date: Mon, 7 Aug 2023 13:42:43 +0200
From: Jesper Dangaard Brouer <hawk@...nel.org>
To: Ratheesh Kannoth <rkannoth@...vell.com>, netdev@...r.kernel.org,
 linux-kernel@...r.kernel.org
Cc: hawk@...nel.org, davem@...emloft.net, edumazet@...gle.com,
 kuba@...nel.org, pabeni@...hat.com,
 Ilias Apalodimas <ilias.apalodimas@...aro.org>,
 Alexander Lobakin <aleksander.lobakin@...el.com>,
 Yunsheng Lin <linyunsheng@...wei.com>,
 Alexander Duyck <alexander.duyck@...il.com>
Subject: Re: [PATCH net-next] page_pool: Clamp ring size to 32K



On 07/08/2023 05.49, Ratheesh Kannoth wrote:
> https://lore.kernel.org/netdev/20230804133512.4dbbbc16@kernel.org/T/
> Capping the recycle ring to 32k instead of returning the error.
> 

Page pool (PP) is just a cache of pages.  The octeontx2 driver (see the
link) is creating an excessively large cache of pages.  The driver's RX
descriptor ring size should be independent of the PP ptr_ring size.  The
ptr_ring is just a cache that grows as a function of the in-flight
packet workload; it functions as a "shock absorber".

32768 pages (4KiB) is approx 128 MiB, and this will be per RX-queue.

The RX-desc ring (obviously) pins down these pages (immediately), but the
PP ring starts empty.  As the workload varies, the "shock absorber" effect
will let more pages into the system, and these will travel through the PP
ptr_ring.  As all pages originating from the same PP instance will get
recycled, the in-flight pages in the "system" (PP ptr_ring) will grow over
time.

The PP design has the problem that it never releases or reduces the pages
in this "closed" shock absorber system.  (Cc'ing PP people/devel) we should
consider implementing an MM shrinker callback (include/linux/shrinker.h).

Are the systems using the octeontx2 driver ready to handle 128 MiB of
memory per RX-queue getting pinned down over time?  (This could lead to
some strange, hard-to-debug situations if the memory is not sufficient.)

--Jesper

> Suggested-by: Jakub Kicinski <kuba@...nel.org>
> Signed-off-by: Ratheesh Kannoth <rkannoth@...vell.com>
> ---
>   net/core/page_pool.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 5d615a169718..404f835a94be 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -182,9 +182,9 @@ static int page_pool_init(struct page_pool *pool,
>   	if (pool->p.pool_size)
>   		ring_qsize = pool->p.pool_size;
>   
> -	/* Sanity limit mem that can be pinned down */
> +	/* Clamp to 32K */
>   	if (ring_qsize > 32768)
> -		return -E2BIG;
> +		ring_qsize = 32768;
>   
>   	/* DMA direction is either DMA_FROM_DEVICE or DMA_BIDIRECTIONAL.
>   	 * DMA_BIDIRECTIONAL is for allowing page used for DMA sending,
