linux-kernel - Re: [PATCH net v6 4/5] net: macb: single dma_alloc

Message-ID: <20250925085154.GW836419@horms.kernel.org>
Date: Thu, 25 Sep 2025 09:51:54 +0100
From: Simon Horman <horms@...nel.org>
To: Théo Lebrun <theo.lebrun@...tlin.com>
Cc: Andrew Lunn <andrew+netdev@...n.ch>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Rob Herring <robh@...nel.org>,
	Krzysztof Kozlowski <krzk+dt@...nel.org>,
	Conor Dooley <conor+dt@...nel.org>,
	Nicolas Ferre <nicolas.ferre@...rochip.com>,
	Claudiu Beznea <claudiu.beznea@...on.dev>,
	Geert Uytterhoeven <geert@...ux-m68k.org>,
	Harini Katakam <harini.katakam@...inx.com>,
	Richard Cochran <richardcochran@...il.com>,
	Russell King <linux@...linux.org.uk>, netdev@...r.kernel.org,
	devicetree@...r.kernel.org, linux-kernel@...r.kernel.org,
	Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
	Tawfik Bayouk <tawfik.bayouk@...ileye.com>,
	Sean Anderson <sean.anderson@...ux.dev>
Subject: Re: [PATCH net v6 4/5] net: macb: single dma_alloc_coherent() for
 DMA descriptors

On Tue, Sep 23, 2025 at 06:00:26PM +0200, Théo Lebrun wrote:
> Move from 2*NUM_QUEUES dma_alloc_coherent() calls for DMA descriptor
> rings to 2 calls overall.
> 
> The issue is that all queues share the same register for configuring
> the upper 32 bits of the Tx/Rx descriptor ring addresses. Taking Tx,
> notice how TBQPH does *not* depend on the queue index:
> 
> 	#define GEM_TBQP(hw_q)		(0x0440 + ((hw_q) << 2))
> 	#define GEM_TBQPH(hw_q)		(0x04C8)
> 
> 	queue_writel(queue, TBQP, lower_32_bits(queue->tx_ring_dma));
> 	#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
> 	if (bp->hw_dma_cap & HW_DMA_CAP_64B)
> 		queue_writel(queue, TBQPH, upper_32_bits(queue->tx_ring_dma));
> 	#endif
> 
> To maximise our chances of getting valid DMA addresses, we do a single
> dma_alloc_coherent() across queues. This improves the odds because
> alloc_pages() guarantees natural alignment. Other codepaths (IOMMU or
> dev/arch dma_map_ops) don't give strong enough guarantees
> (even page alignment isn't enough).
> 
> Two considerations:
> 
>  - dma_alloc_coherent() gives us page alignment. Here we remove this
>    constraint, meaning each queue's ring won't be page-aligned anymore.
> 
>  - This can save some tiny amounts of memory. Fewer allocations means
>    (1) less overhead (constant cost per alloc) and (2) fewer wasted
>    bytes due to alignment constraints.
> 
>    Example for (2): 4 queues, default ring size (512), 64-bit DMA
>    descriptors, 16K pages:
>     - Before: 8 allocs of 8K, each rounded to 16K => 64K wasted.
>     - After:  2 allocs of 32K => 0K wasted.
> 
> Fixes: 02c958dd3446 ("net/macb: add TX multiqueue support for gem")
> Reviewed-by: Sean Anderson <sean.anderson@...ux.dev>
> Acked-by: Nicolas Ferre <nicolas.ferre@...rochip.com>
> Tested-by: Nicolas Ferre <nicolas.ferre@...rochip.com> # on sam9x75
> Signed-off-by: Théo Lebrun <theo.lebrun@...tlin.com>

Reviewed-by: Simon Horman <horms@...nel.org>