Message-ID: <20251014184119.3ba2dd70@kernel.org>
Date: Tue, 14 Oct 2025 18:41:19 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Mina Almasry <almasrymina@...gle.com>
Cc: Pavel Begunkov <asml.silence@...il.com>, netdev@...r.kernel.org, Andrew
Lunn <andrew@...n.ch>, davem@...emloft.net, Eric Dumazet
<edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>, Simon Horman
<horms@...nel.org>, Donald Hunter <donald.hunter@...il.com>, Michael Chan
<michael.chan@...adcom.com>, Pavan Chebbi <pavan.chebbi@...adcom.com>,
Jesper Dangaard Brouer <hawk@...nel.org>, John Fastabend
<john.fastabend@...il.com>, Stanislav Fomichev <sdf@...ichev.me>, Joshua
Washington <joshwash@...gle.com>, Harshitha Ramamurthy
<hramamurthy@...gle.com>, Jian Shen <shenjian15@...wei.com>, Salil Mehta
<salil.mehta@...wei.com>, Jijie Shao <shaojijie@...wei.com>, Sunil Goutham
<sgoutham@...vell.com>, Geetha sowjanya <gakula@...vell.com>, Subbaraya
Sundeep <sbhatta@...vell.com>, hariprasad <hkelam@...vell.com>, Bharat
Bhushan <bbhushan2@...vell.com>, Saeed Mahameed <saeedm@...dia.com>, Tariq
Toukan <tariqt@...dia.com>, Mark Bloch <mbloch@...dia.com>, Leon Romanovsky
<leon@...nel.org>, Alexander Duyck <alexanderduyck@...com>,
kernel-team@...a.com, Ilias Apalodimas <ilias.apalodimas@...aro.org>, Joe
Damato <joe@...a.to>, David Wei <dw@...idwei.uk>, Willem de Bruijn
<willemb@...gle.com>, Breno Leitao <leitao@...ian.org>, Dragos Tatulea
<dtatulea@...dia.com>, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org, linux-rdma@...r.kernel.org, Jonathan Corbet
<corbet@....net>
Subject: Re: [PATCH net-next v4 00/24][pull request] Queue configs and large
buffer providers
On Mon, 13 Oct 2025 21:41:38 -0700 Mina Almasry wrote:
> > I'd like to rework these a little bit.
> > On reflection I don't like the single size control.
> > Please hold off.
>
> FWIW when I last looked at this I didn't like that the size control
> seemed to control the size of the allocations made from the pp, but
> not the size actually posted to the NIC.
>
> I.e. in the scenario where the driver fragments each pp buffer into 2,
> and the user asks for 8K rx-buf-len, the size actually posted to the
> NIC would have actually been 4K (8K / 2 for 2 fragments).
>
> Not sure how much of a concern this really is. I thought it would be
> great if somehow rx-buf-len controlled the buffer sizes actually
> posted to the NIC, because that's what ultimately matters, no (it ends
> up being the size of the incoming frags)? Or does that not matter for
> some reason I'm missing?
I spent a couple of hours trying to write up my thoughts but I still haven't
finished 😅️. I'll send the full thing tomorrow.
You may have been looking at hns3, is that right? It bumps the page pool
order by 1 so that it can fit two allocations into each page. I'm guessing
it's a remnant of "page flipping". The other current user of rx-buf-len
(otx2) doesn't do that - it uses a simple page_order(rx_buf_len), AFAICT.
If that's what you mean - I'd chalk the hns3 behavior up to "historical
reasons", it can probably be straightened out today to everyone's
benefit.
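
To make the sizing schemes above concrete, here is a rough sketch in
Python (not kernel code). It assumes a 4k PAGE_SIZE; get_order() only
mimics the kernel helper of the same name, and the three scheme functions
are hypothetical names that follow the descriptions in this thread rather
than the driver sources.

```python
PAGE_SIZE = 4096  # assumption: 4k base pages


def get_order(size):
    """Smallest order such that (PAGE_SIZE << order) >= size."""
    order = 0
    while (PAGE_SIZE << order) < size:
        order += 1
    return order


def simple_scheme(rx_buf_len):
    """One buffer per page pool allocation (the otx2-like case as described)."""
    alloc = PAGE_SIZE << get_order(rx_buf_len)
    return alloc, alloc  # (pool allocation bytes, bytes posted to the NIC)


def bumped_scheme(rx_buf_len):
    """Order bumped by 1, two buffers per allocation (hns3-like as described)."""
    alloc = PAGE_SIZE << (get_order(rx_buf_len) + 1)
    return alloc, alloc // 2


def split_scheme(rx_buf_len):
    """The quoted scenario: allocation sized by rx-buf-len, then split in 2."""
    alloc = PAGE_SIZE << get_order(rx_buf_len)
    return alloc, alloc // 2


if __name__ == "__main__":
    for scheme in (simple_scheme, bumped_scheme, split_scheme):
        alloc, posted = scheme(8192)
        print(f"{scheme.__name__}: alloc={alloc} posted={posted}")
```

With an 8K rx-buf-len, only the last scheme ends up posting 4K buffers to
the NIC, which is the case the quoted mail is worried about.
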
I wanted to reply already (before I present my "full case" :)) because
my thinking has started drifting in the opposite direction, away from
being concerned about "buffer sizes actually posted to the NIC".
Say the NIC packs packet payloads into buffers like this:
               1      2      3
  packets: xxxxxxxxx yyyy zzzzzzz
  buffers: [xxxx] [xxxx] [x|yyy] [y|zzz] [zzzz]
Hope the diagram makes sense, each [....] is 4k, headers went elsewhere.
Say the user filled the page pool with 16k buffers, and the driver split
each of them up into 4k chunks. HW packed the payloads into those 4k
chunks, and GRO re-formed them back into just 2 skb frags. Do we really
care about the buffer size on the HW fill ring being 4kB? Isn't what the
user cares about that they saw 2 frags, not 5?
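
The frag count in this example can be checked with a toy model in Python.
It is only a sketch under stated assumptions: one character in the diagram
is 1k of payload, HW packs payloads back to back, the driver carves chunks
sequentially out of each pool buffer, and everything lands in one GRO'd skb
where adjacent chunks of the same pool buffer coalesce into a single frag.
pack_and_coalesce() is a hypothetical helper, not a kernel API.

```python
def pack_and_coalesce(payload_lens, chunk_size, pool_buf_size):
    """Return (hw_chunks, frags) as lists of (buf_id, offset, length)."""
    if pool_buf_size % chunk_size:
        raise ValueError("pool buffer must be a multiple of the chunk size")
    chunks_per_buf = pool_buf_size // chunk_size

    # HW fills fixed-size chunks back to back, ignoring packet boundaries.
    chunks = []
    remaining = sum(payload_lens)
    idx = 0
    while remaining > 0:
        used = min(chunk_size, remaining)
        buf_id, slot = divmod(idx, chunks_per_buf)
        chunks.append((buf_id, slot * chunk_size, used))
        remaining -= used
        idx += 1

    # Coalesce chunks that are contiguous within the same pool buffer.
    frags = []
    for buf_id, off, length in chunks:
        if frags and frags[-1][0] == buf_id and frags[-1][1] + frags[-1][2] == off:
            last = frags[-1]
            frags[-1] = (buf_id, last[1], last[2] + length)
        else:
            frags.append((buf_id, off, length))
    return chunks, frags


if __name__ == "__main__":
    K = 1024
    payloads = [9 * K, 4 * K, 7 * K]  # packets 1, 2, 3 from the diagram
    chunks, frags = pack_and_coalesce(payloads, 4 * K, 16 * K)
    print(len(chunks), len(frags))  # 5 2
    chunks, frags = pack_and_coalesce(payloads, 4 * K, 4 * K)
    print(len(chunks), len(frags))  # 5 5
```

In this model the HW fill ring sees five 4k chunks either way; only the
pool buffer size changes how many frags come out the other end.
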