Message-ID: <aQ30WmbN_O60vEzl@lore-desk>
Date: Fri, 7 Nov 2025 14:30:02 +0100
From: Lorenzo Bianconi <lorenzo@...nel.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Eric Dumazet <edumazet@...gle.com>, Andrew Lunn <andrew+netdev@...n.ch>,
"David S. Miller" <davem@...emloft.net>,
Paolo Abeni <pabeni@...hat.com>,
linux-arm-kernel@...ts.infradead.org,
linux-mediatek@...ts.infradead.org, netdev@...r.kernel.org
Subject: Re: [PATCH net-next] net: airoha: Add TCP LRO support
> On Fri, 31 Oct 2025 09:42:15 +0100 Lorenzo Bianconi wrote:
> > > > Hm, truesize is the buffer size, right? If the driver allocated n bytes
> > > > of memory for packets it sent up the stack, the truesizes of the skbs
> > > > it generated must add up to approximately n bytes.
> > >
> > > With 'truesize' I am referring to the real data size contained in the x-order
> > > page returned by the hw. If this size is small, I was thinking to just allocate
> > > a skb for it, copy the data from the x-order page into it and re-insert the
> > > x-order page into the page_pool running page_pool_put_full_page().
> > > Let me do some tests with order-2 page to see if the GRO can compensate the
> > > reduced page size.
> >
> > Sorry for the late reply about this item.
> > I carried out some comparison tests between GRO-only and GRO+LRO with order-2
> > pages [0]. The system is using a 2.5Gbps link. The device is receiving a single TCP
> > stream. MTU is set to 1500B.
> >
> > - GRO only: ~1.6Gbps
> > - GRO+LRO (order-2 pages): ~2.1Gbps
> >
> > In both cases we can't reach the line-rate. Do you think the difference can justify
> > the hw LRO support? Thanks in advance.
> >
> > [0] the hw LRO requires contiguous memory pages to work. I reduced the size to
> > order-2 from order-5 (original implementation).
>
> I think we're mostly advising about real world implications of
> the approach rather than nacking. I can't say for sure if potentially
> terrible skb->len/skb->truesize ratio will matter for a router
> application. Maybe not.
>
> BTW is the device doing header-data split or the LRO frame has headers
> and payload in a single buffer?

As far as I understand, the hw LRO is limited to a single order-x page
containing both the headers and the payload, so there is no header-data split
(the hw LRO module cannot split the aggregated TCP segment across multiple
pages).

What we could do is disable hw LRO by default and feed the hw rx queues with
order-0 pages (current implementation). If the user enables hw LRO, we free
the order-0 pages linked to the rx DMA descriptors and allocate order-x pages
(e.g. order-2) for the hw LRO queues. Disabling hw LRO switches back to
order-0 pages.
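
To make the proposal above concrete, here is a rough userspace sketch of the
buffer switch, not driver code. Everything in it is an assumption made for
illustration: the names (rx_queue, rx_desc, rxq_set_lro), the ring size and
the 4 KiB base page are hypothetical, and malloc()/free() stand in for the
page_pool allocations. A real driver would also have to quiesce the queue and
reprogram the DMA descriptors around the switch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE_BYTES 4096u /* assumed 4 KiB base page */
#define RX_RING_SIZE    8u    /* hypothetical ring size */
#define LRO_PAGE_ORDER  2u    /* e.g. order-2, as in the tests quoted above */

/* Stands in for the page linked to one rx DMA descriptor. */
struct rx_desc {
	void *buf;
	size_t len;
};

struct rx_queue {
	struct rx_desc desc[RX_RING_SIZE];
	unsigned int order; /* page order currently backing the ring */
	bool lro;           /* hw LRO state, off by default */
};

static size_t order_to_bytes(unsigned int order)
{
	return (size_t)PAGE_SIZE_BYTES << order;
}

/* Release every buffer attached to the ring. */
static void rxq_free_bufs(struct rx_queue *q)
{
	for (size_t i = 0; i < RX_RING_SIZE; i++) {
		free(q->desc[i].buf);
		q->desc[i].buf = NULL;
		q->desc[i].len = 0;
	}
}

/* Attach one buffer of the given order to each descriptor. */
static int rxq_fill(struct rx_queue *q, unsigned int order)
{
	size_t len = order_to_bytes(order);

	for (size_t i = 0; i < RX_RING_SIZE; i++) {
		q->desc[i].buf = malloc(len);
		if (!q->desc[i].buf) {
			rxq_free_bufs(q); /* leave the ring empty on failure */
			return -1;
		}
		q->desc[i].len = len;
	}
	q->order = order;
	return 0;
}

/* Default state: hw LRO disabled, ring fed with order-0 buffers. */
static int rxq_init(struct rx_queue *q)
{
	*q = (struct rx_queue){ 0 };
	return rxq_fill(q, 0);
}

/* Toggle hw LRO: drop the current buffers and refill with the new order. */
static int rxq_set_lro(struct rx_queue *q, bool enable)
{
	assert(q != NULL);

	if (q->lro == enable)
		return 0;

	rxq_free_bufs(q);
	if (rxq_fill(q, enable ? LRO_PAGE_ORDER : 0) < 0)
		return -1;

	q->lro = enable;
	return 0;
}
```

In this sketch, enabling LRO moves each descriptor from a 4096-byte buffer to
a 16384-byte one, and disabling it restores the order-0 layout. If the refill
fails, the ring is left empty, so a real implementation would need a fallback
path to the previous order.
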
Regards,
Lorenzo