Message-ID: <PH0PR12MB5481383E1487D7EA976BB085DC489@PH0PR12MB5481.namprd12.prod.outlook.com>
Date: Wed, 31 May 2023 17:07:19 +0000
From: Parav Pandit <parav@...dia.com>
To: Eric Dumazet <edumazet@...gle.com>, David Ahern <dsahern@...nel.org>
CC: Jakub Kicinski <kuba@...nel.org>, "davem@...emloft.net"
	<davem@...emloft.net>, "pabeni@...hat.com" <pabeni@...hat.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH net-next] net: Make gro complete function to return void


> From: Eric Dumazet <edumazet@...gle.com>
> Sent: Wednesday, May 31, 2023 12:39 AM
> 
> On Wed, May 31, 2023 at 12:36 AM David Ahern <dsahern@...nel.org> wrote:
> >
> > On 5/30/23 1:39 PM, Jakub Kicinski wrote:
> > > On Tue, 30 May 2023 17:48:22 +0200 Eric Dumazet wrote:
> > >>> tcp_gro_complete seems fairly trivial. Any reason not to make it
> > >>> an inline and avoid another function call in the datapath?
> > >>
> > >> Probably, although it is a regular function call, not an indirect one.
> > >>
> > >> In the grand total of driver rx napi + GRO cost, saving a few
> > >> cycles per GRO completed packet is quite small.
> > >
> > > IOW please make sure you include the performance analysis
> > > quantifying the win, if you want to make this a static inline. Or
> > > let us know if the patch is good as is, I'm keeping it in pw for now.
> >
> > I am not suggesting holding up this patch; just constantly looking for
> > these little savings here and there to keep lowering the overhead.
> >
> > 100G, 1500 MTU, line rate is 8.3M pps so GRO wise that would be ~180k
> > fewer function calls.
> 
> Here with 4K MTU, this is called 67k times per second
> 
> An __skb_put() instead of skb_put() in a driver (eg mlx5e_build_linear_skb())
> would have 45x more impact, and would still be noise.

Thanks for the suggestion, Eric. I will evaluate with Tariq whether to use __skb_put() there.
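
For reference, the figures quoted above can be reproduced with a rough back-of-the-envelope calculation. This is only a sketch under stated assumptions: it ignores Ethernet framing overhead (as the 8.3M pps figure appears to), assumes GRO aggregates up to about 64 KiB per completed packet, and assumes 52 bytes of IPv4 + TCP (with timestamps) headers per segment. The constant and function names are made up for illustration and are not kernel symbols.

```python
# Rough rates behind the numbers quoted in this thread.
# All constants below are assumptions for illustration, not measured values.
LINK_BPS = 100e9        # 100G link
MTU_BYTES = 1500        # MTU from David's example
GRO_MAX_BYTES = 65536   # assumption: GRO aggregates up to ~64 KiB
HEADER_BYTES = 52       # assumption: IPv4 (20) + TCP with timestamps (32)


def packets_per_second(link_bps, mtu_bytes):
    """Line-rate packets per second, ignoring Ethernet framing overhead."""
    return link_bps / (mtu_bytes * 8)


def segments_per_gro_packet(mtu_bytes, gro_max=GRO_MAX_BYTES, headers=HEADER_BYTES):
    """How many MTU-sized TCP segments fit into one aggregated GRO packet."""
    return gro_max // (mtu_bytes - headers)


pps = packets_per_second(LINK_BPS, MTU_BYTES)
segs = segments_per_gro_packet(MTU_BYTES)
gro_completions_per_second = pps / segs

print(f"line rate:       {pps / 1e6:.1f}M pps")
print(f"segments/GRO:    {segs}")
print(f"GRO completions: {gro_completions_per_second / 1e3:.0f}k per second")
```

Under these assumptions a per-packet cost (such as skb_put() in the driver) is paid roughly 45 times for every call to the GRO complete function, which is consistent with the "45x more impact" estimate.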
