[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <c37ae03d-2b0e-4179-ac22-7d0c01ff601d@gmail.com>
Date: Fri, 30 Jan 2026 11:13:35 +0000
From: Pavel Begunkov <asml.silence@...il.com>
To: Bobby Eshleman <bobbyeshleman@...il.com>,
Stanislav Fomichev <stfomichev@...il.com>
Cc: Jakub Kicinski <kuba@...nel.org>, Mina Almasry <almasrymina@...gle.com>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
Kuniyuki Iwashima <kuniyu@...gle.com>, Willem de Bruijn
<willemb@...gle.com>, Neal Cardwell <ncardwell@...gle.com>,
David Ahern <dsahern@...nel.org>, Arnd Bergmann <arnd@...db.de>,
Jonathan Corbet <corbet@....net>, Andrew Lunn <andrew+netdev@...n.ch>,
Shuah Khan <shuah@...nel.org>, Donald Hunter <donald.hunter@...il.com>,
Stanislav Fomichev <sdf@...ichev.me>, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
linux-doc@...r.kernel.org, linux-kselftest@...r.kernel.org,
matttbe@...nel.org, skhawaja@...gle.com,
Bobby Eshleman <bobbyeshleman@...a.com>
Subject: Re: [PATCH net-next v10 0/5] net: devmem: improve cpu cost of RX
token management
On 1/27/26 06:48, Bobby Eshleman wrote:
> On Mon, Jan 26, 2026 at 10:00 PM Stanislav Fomichev
> <stfomichev@...il.com> wrote:
>>
>> On 01/26, Jakub Kicinski wrote:
>>> On Mon, 26 Jan 2026 10:45:22 -0800 Bobby Eshleman wrote:
>>>> I'm onboard with improving what we have since it helps all of us
>>>> currently using this API, though I'm not opposed to discussing a
>>>> redesign in another thread/RFC. I do see the attraction to locating the
>>>> core logic in one place and possibly reducing some complexity around
>>>> socket/binding relationships.
>>>>
>>>> FWIW regarding nl, I do see it supports rtnl lock-free operations via
>>>> '62256f98f244 rtnetlink: add RTNL_FLAG_DOIT_UNLOCKED' and routing was
>>>> recently made lockless with that. I don't see / know of any fast path
>>>> precedent. I'm aware there are some things I'm not sure about being
>>>> relevant performance-wise, like hitting skb alloc an additional time
>>>> every release batch. I'd want to do some minimal latency comparisons
>>>> between that path and sockopt before diving head-first.
>>>
>>> FTR I'm not really pushing Netlink specifically, it may work it
>>> may not. Perhaps some other ioctl-y thing exists. Just in general
>>> setsockopt() on a specific socket feels increasingly awkward for
>>> buffer flow. Maybe y'all disagree.
>>>
>>> I thought I'd clarify since I may be seen as "Mr Netlink Everywhere" :)
>>
>> From my side, if we do a completely new uapi, my preference would be on
>> an af_xdp like mapped rings (presumably on a netlink socket?) to completely
>> avoid the user-kernel copies.
>
> I second liking that approach. No put_cmsg() and or token alloc overhead (both
> jump up in my profiling).
Hmm, makes me wonder why not use zcrx instead of reinventing it? It
doesn't bind net_iov to sockets just as you do in this series. And it
also returns buffers back via a shared ring. Otherwise you'll be facing
same issues, like rings running out of space, and so you will need to
have a fallback path. And user space will need to synchronise the ring
if it's shared with other threads, and there will be a question of how
to scale it next, possibly by creating multiple rings as I'll likely to
do soon for zcrx.
--
Pavel Begunkov
Powered by blists - more mailing lists