Message-ID:
 <SA3PR21MB38673CA4DDE618A5D9C4FA99CA8CA@SA3PR21MB3867.namprd21.prod.outlook.com>
Date: Thu, 15 Jan 2026 19:57:44 +0000
From: Haiyang Zhang <haiyangz@...rosoft.com>
To: Jakub Kicinski <kuba@...nel.org>
CC: Haiyang Zhang <haiyangz@...ux.microsoft.com>,
	"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>, KY Srinivasan
	<kys@...rosoft.com>, Wei Liu <wei.liu@...nel.org>, Dexuan Cui
	<DECUI@...rosoft.com>, Long Li <longli@...rosoft.com>, Andrew Lunn
	<andrew+netdev@...n.ch>, "David S. Miller" <davem@...emloft.net>, Eric
 Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>, Konstantin
 Taranov <kotaranov@...rosoft.com>, Simon Horman <horms@...nel.org>, Erni Sri
 Satya Vennela <ernis@...ux.microsoft.com>, Shradha Gupta
	<shradhagupta@...ux.microsoft.com>, Saurabh Sengar
	<ssengar@...ux.microsoft.com>, Aditya Garg <gargaditya@...ux.microsoft.com>,
	Dipayaan Roy <dipayanroy@...ux.microsoft.com>, Shiraz Saleem
	<shirazsaleem@...rosoft.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "linux-rdma@...r.kernel.org"
	<linux-rdma@...r.kernel.org>, Paul Rosswurm <paulros@...rosoft.com>
Subject: RE: [EXTERNAL] Re: [PATCH V2,net-next, 1/2] net: mana: Add support
 for coalesced RX packets on CQE



> -----Original Message-----
> From: Jakub Kicinski <kuba@...nel.org>
> Sent: Wednesday, January 14, 2026 9:55 PM
> To: Haiyang Zhang <haiyangz@...rosoft.com>
> Subject: Re: [EXTERNAL] Re: [PATCH V2,net-next, 1/2] net: mana: Add
> support for coalesced RX packets on CQE
> 
> On Wed, 14 Jan 2026 18:27:50 +0000 Haiyang Zhang wrote:
> > > > And, the coalescing can add up to 2 microseconds into one-way latency.
> > >
> > > I am asking you how the _device_ (hypervisor?) decides when to coalesce
> > > and when to send a partial CQE (<4 packets in 4 pkt CQE). You are using
> > > the coalescing uAPI, so I'm trying to make sure this is the correct API.
> > > CQE configuration can also be done via ringparam.
> >
> > When coalescing is enabled, the device waits for packets which can
> > have the CQE coalesced with previous packet(s). That coalescing process
> > is finished (and a CQE written to the appropriate CQ) when the CQE is
> > filled with 4 pkts, or time expired, or other device specific logic is
> > satisfied.
> 
> See, what I'm afraid is happening here is that you are enabling
> completion coalescing (how long the device keeps the CQE pending).
> Which is _not_ what rx_max_coalesced_frames controls for most NICs.
> For most NICs rx_max_coalesced_frames controls IRQ generation logic.
> 
> The NIC first buffers up CQEs for typically single digit usecs, and
> then once the CQE timer has expired and writeback has happened it starts
> an IRQ coalescing timer. Once the IRQ coalescing timer expires, an IRQ is
> triggered, which schedules NAPI. (broad strokes, obviously many
> differences and optimizations exist)
> 
> Is my guess correct? Are you controlling CQE coalescing?
> 
> Can you control the timeout instead of the frame count?
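
A minimal sketch of the completion rule described in the quoted text (a CQE is written once it holds 4 packets or a timer expires), written as a standalone C model rather than driver or firmware code. The struct and function names are hypothetical, and the model assumes the timer starts at the first packet of a pending CQE; the thread does not state the real timeout or when the timer starts, and the "other device specific logic" is not modeled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of CQE coalescing; not MANA driver or firmware code. */
struct cqe_model {
	unsigned int max_pkts;	/* packets per CQE: 4 when coalescing, 1 when off */
	uint64_t timeout_ns;	/* assumed device timer; not visible to the driver */
};

/*
 * Count the CQEs written for packets arriving at arrival_ns[0..n-1]
 * (sorted ascending). A pending CQE is closed when it holds max_pkts
 * packets, or when the next packet arrives more than timeout_ns after
 * the first packet in that CQE (assumption: timer starts at first packet).
 */
size_t count_cqes(const struct cqe_model *m, const uint64_t *arrival_ns,
		  size_t n)
{
	size_t cqes = 0;
	unsigned int pending = 0;
	uint64_t first_ns = 0;

	for (size_t i = 0; i < n; i++) {
		if (pending > 0 && arrival_ns[i] - first_ns > m->timeout_ns) {
			cqes++;		/* timer expired: partial CQE written */
			pending = 0;
		}
		if (pending == 0)
			first_ns = arrival_ns[i];
		pending++;
		if (pending == m->max_pkts) {
			cqes++;		/* CQE is full */
			pending = 0;
		}
	}
	if (pending > 0)
		cqes++;			/* last partial CQE, flushed by the timer */
	return cqes;
}
```

With max_pkts set to 1 every packet gets its own CQE, which corresponds to coalescing being switched off.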

Our NIC's timeout value cannot be controlled by the driver. Also, the
timeout may change in future NIC hardware.

So I use the ethtool rx-frames setting, which can only be 1 or 4 on our
NIC, to switch the CQE coalescing feature off (1) or on (4).
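
As a rough illustration of that mapping (not the code from the patch), a driver-side check could accept only the two supported rx-frames values and translate them into an on/off flag. The helper name and the macros below are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical constants: the only two rx-frames values named in this thread. */
#define RX_FRAMES_COALESCE_OFF	1u	/* one packet per CQE */
#define RX_FRAMES_COALESCE_ON	4u	/* up to four packets per CQE */

/*
 * Hypothetical helper, not taken from the patch: translate the requested
 * rx-frames value into a CQE coalescing on/off flag. Any other value is
 * rejected, since the device supports no intermediate setting.
 */
int rx_frames_to_cqe_coalescing(unsigned int rx_frames, bool *enable)
{
	switch (rx_frames) {
	case RX_FRAMES_COALESCE_OFF:
		*enable = false;
		return 0;
	case RX_FRAMES_COALESCE_ON:
		*enable = true;
		return 0;
	default:
		return -EINVAL;
	}
}
```

From user space this would correspond to something like "ethtool -C <dev> rx-frames 4" to enable and "rx-frames 1" to disable, with <dev> standing in for the interface name.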

Thanks,
- Haiyang

