Message-ID:
<SA3PR21MB3867D18555258EDB7FCF9ACACA8AA@SA3PR21MB3867.namprd21.prod.outlook.com>
Date: Sat, 17 Jan 2026 18:01:18 +0000
From: Haiyang Zhang <haiyangz@...rosoft.com>
To: Jakub Kicinski <kuba@...nel.org>
CC: Haiyang Zhang <haiyangz@...ux.microsoft.com>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>, KY Srinivasan
<kys@...rosoft.com>, Wei Liu <wei.liu@...nel.org>, Dexuan Cui
<DECUI@...rosoft.com>, Long Li <longli@...rosoft.com>, Andrew Lunn
<andrew+netdev@...n.ch>, "David S. Miller" <davem@...emloft.net>, Eric
Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>, Konstantin
Taranov <kotaranov@...rosoft.com>, Simon Horman <horms@...nel.org>, Erni Sri
Satya Vennela <ernis@...ux.microsoft.com>, Shradha Gupta
<shradhagupta@...ux.microsoft.com>, Saurabh Sengar
<ssengar@...ux.microsoft.com>, Aditya Garg <gargaditya@...ux.microsoft.com>,
Dipayaan Roy <dipayanroy@...ux.microsoft.com>, Shiraz Saleem
<shirazsaleem@...rosoft.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "linux-rdma@...r.kernel.org"
<linux-rdma@...r.kernel.org>, Paul Rosswurm <paulros@...rosoft.com>
Subject: RE: [EXTERNAL] Re: [PATCH V2,net-next, 1/2] net: mana: Add support
for coalesced RX packets on CQE
> -----Original Message-----
> From: Jakub Kicinski <kuba@...nel.org>
> Sent: Saturday, January 17, 2026 11:59 AM
> To: Haiyang Zhang <haiyangz@...rosoft.com>
> Cc: Haiyang Zhang <haiyangz@...ux.microsoft.com>; linux-
> hyperv@...r.kernel.org; netdev@...r.kernel.org; KY Srinivasan
> <kys@...rosoft.com>; Wei Liu <wei.liu@...nel.org>; Dexuan Cui
> <DECUI@...rosoft.com>; Long Li <longli@...rosoft.com>; Andrew Lunn
> <andrew+netdev@...n.ch>; David S. Miller <davem@...emloft.net>; Eric
> Dumazet <edumazet@...gle.com>; Paolo Abeni <pabeni@...hat.com>; Konstantin
> Taranov <kotaranov@...rosoft.com>; Simon Horman <horms@...nel.org>; Erni
> Sri Satya Vennela <ernis@...ux.microsoft.com>; Shradha Gupta
> <shradhagupta@...ux.microsoft.com>; Saurabh Sengar
> <ssengar@...ux.microsoft.com>; Aditya Garg
> <gargaditya@...ux.microsoft.com>; Dipayaan Roy
> <dipayanroy@...ux.microsoft.com>; Shiraz Saleem
> <shirazsaleem@...rosoft.com>; linux-kernel@...r.kernel.org; linux-
> rdma@...r.kernel.org; Paul Rosswurm <paulros@...rosoft.com>
> Subject: Re: [EXTERNAL] Re: [PATCH V2,net-next, 1/2] net: mana: Add
> support for coalesced RX packets on CQE
>
> On Fri, 16 Jan 2026 16:44:33 +0000 Haiyang Zhang wrote:
> > > You need to add a new param to the uAPI.
> >
> > Since this feature is not common to other NICs, can we use an
> > ethtool private flag instead?
>
> It's extremely common. Descriptor writeback at the granularity of one
> packet would kill PCIe performance. We just don't have uAPI so NICs
> either don't expose the knob or "reuse" another coalescing param.

I see. How about adding a new parameter to "ethtool -C", like this?

    ethtool -C|--coalesce devname [rx-cqe-coalesce on|off]

> > When the flag is set, the CQE coalescing will be enabled and put
> > up to 4 pkts in a CQE.
> >
> > > Please add both size and
> > > timeout. Expose the timeout as read only if your device doesn't
> > > support controlling it per queue.
> >
> > Does the "size" mean the max pkts per CQE (1 or 4)?
>
> The definition of "size" is always a little funny when it comes to
> coalescing and ringparam. In Tx does one frame mean one wire frame
> or one TSO superframe? I wouldn't worry about the exact meaning of
> size too much. Important thing is that user knows what making this
> param smaller or larger will do.

Should the "ethtool -c" output then get a new value like this?

    rx-cqe-frames: (1 or 4 frames per CQE for this NIC)

> > The timeout value is not even exposed to driver, and subject to change
> > in the future. Also the HW mechanism is proprietary... So, can we not
> > "expose" the timeout value in "ethtool -c" outputs, because it's not
> > available at driver level?
>
> Add it to the FW API and have FW send the current value to the driver?

I don't know where the timeout value lives in the HW/FW layers. Adding
new info to the HW/FW API needs approval and work from other teams,
which involves a complex process and will take a long time.

> You were concerned (in the commit msg) that there's a latency cost,
> which is fair but I think for 99% of users 2usec is absolutely
> not detectable (it takes longer for the CPU to wake). So I think it'd
> be very valuable to the user to understand the order of magnitude of
> latency we're talking about here.

For now, may I document the 2 usec in the patch description, and add a
new item to the "ethtool -c" output, like "rx-cqe-usecs", labeled as
"n/a" while we work with the other teams on a timeout value API at the
HW/FW layers? That way, support for the CQE coalescing feature won't be
blocked for a long time by the "2usec" info API.

Thanks,
- Haiyang
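
To make the trade-off discussed above concrete, here is a rough sketch of
how the number of CQE writebacks could change with the frames-per-CQE
limit. The model is an assumption, not the MANA hardware behavior (the
thread notes the real mechanism is proprietary): a CQE is opened by the
first packet and written back when it holds max_frames packets or when
timeout_ns has passed since it was opened, whichever comes first. The
function name count_cqes and its parameters are hypothetical; only the
"up to 4 pkts" and "2usec" figures come from the thread.

```python
def count_cqes(arrivals_ns, max_frames=4, timeout_ns=2000):
    """Count CQE writebacks for sorted packet arrival times (in ns).

    Assumed model: a CQE is flushed when it is full (max_frames) or
    when timeout_ns has elapsed since its first packet arrived.
    """
    cqes = 0          # CQEs written back so far
    in_cqe = 0        # packets held in the currently open CQE
    opened_at = 0     # arrival time of the first packet in the open CQE

    for t in arrivals_ns:
        # The timer expired before this packet arrived: flush the
        # partially filled CQE first.
        if in_cqe and t - opened_at >= timeout_ns:
            cqes += 1
            in_cqe = 0
        if in_cqe == 0:
            opened_at = t
        in_cqe += 1
        # The CQE is full: write it back immediately.
        if in_cqe == max_frames:
            cqes += 1
            in_cqe = 0

    # Whatever is left is flushed by the timer eventually.
    if in_cqe:
        cqes += 1
    return cqes


if __name__ == "__main__":
    burst = [i * 100 for i in range(8)]      # 8 packets, 100 ns apart
    sparse = [0, 10_000, 20_000, 30_000]     # 4 packets, 10 usec apart
    print(count_cqes(burst, max_frames=1))
    print(count_cqes(burst, max_frames=4))
    print(count_cqes(sparse, max_frames=4))
```

Under this model, the 8-packet burst costs 8 writebacks with one frame per
CQE and 2 with four frames per CQE. The sparse flow still costs one
writeback per packet, but each completion can be held for up to the
timeout, which is the latency cost raised in the commit message.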