Message-ID: <aP-cgMiJ-y_PX7Xa@horms.kernel.org>
Date: Mon, 27 Oct 2025 16:23:28 +0000
From: Simon Horman <horms@...nel.org>
To: Alexander Lobakin <aleksander.lobakin@...el.com>
Cc: intel-wired-lan@...ts.osuosl.org,
Tony Nguyen <anthony.l.nguyen@...el.com>,
Przemek Kitszel <przemyslaw.kitszel@...el.com>,
Andrew Lunn <andrew+netdev@...n.ch>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
nxne.cnse.osdt.itp.upstreaming@...el.com, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH iwl-next] ice: implement configurable header split for
regular Rx
On Mon, Oct 06, 2025 at 06:20:53PM +0200, Alexander Lobakin wrote:
> Add a second page_pool for header buffers to each Rx queue and the
> ability to toggle header split on/off using Ethtool (defaults to off
> to match the current behaviour).
> Unlike idpf, none of the HW supported by ice requires any workarounds;
> it correctly splits all types of packets as configured: after the L4
> headers for TCP/UDP/SCTP, after the L3 headers for other IPv4/IPv6
> frames, and after the Ethernet header otherwise (for tunnels, same as
> above, but after the innermost headers).
> This doesn't affect the XSk path, as there is no benefit to having it
> there.
>
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@...el.com>
> ---
> Applies on top of Tony's next-queue, depends on Michał's Page Pool
> conversion series.
>
> Sending for review and validation purposes.
>
> Testing hints: run traffic tests with and without header split enabled.
> Header split can be toggled using Ethtool:
>
> sudo ethtool -G <iface> tcp-data-split on (or off)
Nice, I'm very pleased to see this feature in the pipeline for the ice driver.
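
FWIW, for anyone testing this: assuming a reasonably recent ethtool,
the current state can also be read back via the ring parameters, e.g.:

  ethtool -g <iface> | grep -i 'tcp data split'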
...
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
...
> @@ -836,6 +858,20 @@ bool ice_alloc_rx_bufs(struct ice_rx_ring *rx_ring, unsigned int cleaned_count)
> */
> rx_desc->read.pkt_addr = cpu_to_le64(addr);
>
> + if (!hdr_fq.pp)
> + goto next;
> +
> + addr = libeth_rx_alloc(&hdr_fq, ntu);
> + if (addr == DMA_MAPPING_ERROR) {
> + rx_ring->ring_stats->rx_stats.alloc_page_failed++;
> +
> + libeth_rx_recycle_slow(fq.fqes[ntu].netmem);
> + break;
> + }
> +
> + rx_desc->read.hdr_addr = cpu_to_le64(addr);
> +
> +next:
Is performance the reason that a goto is used here, rather than, say,
putting the conditional code inside an if block? Likewise in
ice_clean_rx_irq?
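
For this hunk, something like the following (completely untested, just
to illustrate what I mean):

		if (hdr_fq.pp) {
			addr = libeth_rx_alloc(&hdr_fq, ntu);
			if (addr == DMA_MAPPING_ERROR) {
				rx_ring->ring_stats->rx_stats.alloc_page_failed++;

				libeth_rx_recycle_slow(fq.fqes[ntu].netmem);
				break;
			}

			rx_desc->read.hdr_addr = cpu_to_le64(addr);
		}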
> rx_desc++;
> ntu++;
> if (unlikely(ntu == rx_ring->count)) {
> @@ -933,14 +969,16 @@ static int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
> unsigned int size;
> u16 stat_err_bits;
> u16 vlan_tci;
> + bool rxe;
>
> /* get the Rx desc from Rx ring based on 'next_to_clean' */
> rx_desc = ICE_RX_DESC(rx_ring, ntc);
>
> - /* status_error_len will always be zero for unused descriptors
> - * because it's cleared in cleanup, and overlaps with hdr_addr
> - * which is always zero because packet split isn't used, if the
> - * hardware wrote DD then it will be non-zero
> + /*
> + * The DD bit will always be zero for unused descriptors
> + * because it's cleared in cleanup or when setting the DMA
> + * address of the header buffer, which never uses the DD bit.
> + * If the hardware wrote the descriptor, it will be non-zero.
> */
The update to this comment feels like it could be a separate patch.
(I know, I often say something like that...)
> stat_err_bits = BIT(ICE_RX_FLEX_DESC_STATUS0_DD_S);
> if (!ice_test_staterr(rx_desc->wb.status_error0, stat_err_bits))
> @@ -954,12 +992,27 @@ static int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
>
> ice_trace(clean_rx_irq, rx_ring, rx_desc);
>
> + stat_err_bits = BIT(ICE_RX_FLEX_DESC_STATUS0_HBO_S) |
> + BIT(ICE_RX_FLEX_DESC_STATUS0_RXE_S);
> + rxe = ice_test_staterr(rx_desc->wb.status_error0,
> + stat_err_bits);
> +
> + if (!rx_ring->hdr_pp)
> + goto payload;
> +
> + size = le16_get_bits(rx_desc->wb.hdr_len_sph_flex_flags1,
> + ICE_RX_FLEX_DESC_HDR_LEN_M);
> + if (unlikely(rxe))
> + size = 0;
> +
> + rx_buf = &rx_ring->hdr_fqes[ntc];
> + libeth_xdp_process_buff(xdp, rx_buf, size);
> + rx_buf->netmem = 0;
> +
> +payload:
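
Likewise here, the goto could perhaps become (again, untested):

		if (rx_ring->hdr_pp) {
			size = le16_get_bits(rx_desc->wb.hdr_len_sph_flex_flags1,
					     ICE_RX_FLEX_DESC_HDR_LEN_M);
			if (unlikely(rxe))
				size = 0;

			rx_buf = &rx_ring->hdr_fqes[ntc];
			libeth_xdp_process_buff(xdp, rx_buf, size);
			rx_buf->netmem = 0;
		}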
> size = le16_to_cpu(rx_desc->wb.pkt_len) &
> ICE_RX_FLX_DESC_PKT_LEN_M;
> -
> - stat_err_bits = BIT(ICE_RX_FLEX_DESC_STATUS0_RXE_S);
> - if (unlikely(ice_test_staterr(rx_desc->wb.status_error0,
> - stat_err_bits)))
> + if (unlikely(rxe))
> size = 0;
>
> /* retrieve a buffer from the ring */
> --
> 2.51.0
>