lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aP-cgMiJ-y_PX7Xa@horms.kernel.org>
Date: Mon, 27 Oct 2025 16:23:28 +0000
From: Simon Horman <horms@...nel.org>
To: Alexander Lobakin <aleksander.lobakin@...el.com>
Cc: intel-wired-lan@...ts.osuosl.org,
	Tony Nguyen <anthony.l.nguyen@...el.com>,
	Przemek Kitszel <przemyslaw.kitszel@...el.com>,
	Andrew Lunn <andrew+netdev@...n.ch>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	nxne.cnse.osdt.itp.upstreaming@...el.com, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH iwl-next] ice: implement configurable header split for
 regular Rx

On Mon, Oct 06, 2025 at 06:20:53PM +0200, Alexander Lobakin wrote:
> Add second page_pool for header buffers to each Rx queue and ability
> to toggle the header split on/off using Ethtool (default to off to
> match the current behaviour).
> Unlike idpf, all HW backed up by ice doesn't require any W/As and
> correctly splits all types of packets as configured: after L4 headers
> for TCP/UDP/SCTP, after L3 headers for other IPv4/IPv6 frames, after
> the Ethernet header otherwise (in case of tunneling, same as above,
> but after innermost headers).
> This doesn't affect the XSk path as there are no benefits of having
> it there.
> 
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@...el.com>
> ---
> Applies on top of Tony's next-queue, depends on MichaƂ's Page Pool
> conversion series.
> 
> Sending for review and validation purposes.
> 
> Testing hints: traffic testing with and without header split enabled.
> The header split can be turned on/off using Ethtool:
> 
> sudo ethtool -G <iface> tcp-data-split on (or off)

Nice, I'm very pleased to see this feature in the pipeline for the ice driver.

...

> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c

...

> @@ -836,6 +858,20 @@ bool ice_alloc_rx_bufs(struct ice_rx_ring *rx_ring, unsigned int cleaned_count)
>  		 */
>  		rx_desc->read.pkt_addr = cpu_to_le64(addr);
>  
> +		if (!hdr_fq.pp)
> +			goto next;
> +
> +		addr = libeth_rx_alloc(&hdr_fq, ntu);
> +		if (addr == DMA_MAPPING_ERROR) {
> +			rx_ring->ring_stats->rx_stats.alloc_page_failed++;
> +
> +			libeth_rx_recycle_slow(fq.fqes[ntu].netmem);
> +			break;
> +		}
> +
> +		rx_desc->read.hdr_addr = cpu_to_le64(addr);
> +
> +next:

Is performance the reason that a goto is used here, rather than, say, putting
the conditional code in an if condition? Likewise in ice_clean_rx_irq?

>  		rx_desc++;
>  		ntu++;
>  		if (unlikely(ntu == rx_ring->count)) {
> @@ -933,14 +969,16 @@ static int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
>  		unsigned int size;
>  		u16 stat_err_bits;
>  		u16 vlan_tci;
> +		bool rxe;
>  
>  		/* get the Rx desc from Rx ring based on 'next_to_clean' */
>  		rx_desc = ICE_RX_DESC(rx_ring, ntc);
>  
> -		/* status_error_len will always be zero for unused descriptors
> -		 * because it's cleared in cleanup, and overlaps with hdr_addr
> -		 * which is always zero because packet split isn't used, if the
> -		 * hardware wrote DD then it will be non-zero
> +		/*
> +		 * The DD bit will always be zero for unused descriptors
> +		 * because it's cleared in cleanup or when setting the DMA
> +		 * address of the header buffer, which never uses the DD bit.
> +		 * If the hardware wrote the descriptor, it will be non-zero.
>  		 */

The update to this comment feels like it could be a separate patch.
(I know, I often say something like that...)

>  		stat_err_bits = BIT(ICE_RX_FLEX_DESC_STATUS0_DD_S);
>  		if (!ice_test_staterr(rx_desc->wb.status_error0, stat_err_bits))
> @@ -954,12 +992,27 @@ static int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
>  
>  		ice_trace(clean_rx_irq, rx_ring, rx_desc);
>  
> +		stat_err_bits = BIT(ICE_RX_FLEX_DESC_STATUS0_HBO_S) |
> +				BIT(ICE_RX_FLEX_DESC_STATUS0_RXE_S);
> +		rxe = ice_test_staterr(rx_desc->wb.status_error0,
> +				       stat_err_bits);
> +
> +		if (!rx_ring->hdr_pp)
> +			goto payload;
> +
> +		size = le16_get_bits(rx_desc->wb.hdr_len_sph_flex_flags1,
> +				     ICE_RX_FLEX_DESC_HDR_LEN_M);
> +		if (unlikely(rxe))
> +			size = 0;
> +
> +		rx_buf = &rx_ring->hdr_fqes[ntc];
> +		libeth_xdp_process_buff(xdp, rx_buf, size);
> +		rx_buf->netmem = 0;
> +
> +payload:
>  		size = le16_to_cpu(rx_desc->wb.pkt_len) &
>  			ICE_RX_FLX_DESC_PKT_LEN_M;
> -
> -		stat_err_bits = BIT(ICE_RX_FLEX_DESC_STATUS0_RXE_S);
> -		if (unlikely(ice_test_staterr(rx_desc->wb.status_error0,
> -					      stat_err_bits)))
> +		if (unlikely(rxe))
>  			size = 0;
>  
>  		/* retrieve a buffer from the ring */
> -- 
> 2.51.0
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ