Message-ID: <7afd8717-4b3a-2104-3581-4cf3440be0f8@bootlin.com>
Date: Tue, 9 Jan 2024 16:16:49 +0100 (CET)
From: Romain Gantois <romain.gantois@...tlin.com>
To: Vladimir Oltean <vladimir.oltean@....com>
cc: Romain Gantois <romain.gantois@...tlin.com>, 
    Alexandre Torgue <alexandre.torgue@...s.st.com>, 
    Jose Abreu <joabreu@...opsys.com>, "David S. Miller" <davem@...emloft.net>, 
    Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, 
    Paolo Abeni <pabeni@...hat.com>, 
    Maxime Coquelin <mcoquelin.stm32@...il.com>, 
    Miquel Raynal <miquel.raynal@...tlin.com>, 
    Maxime Chevallier <maxime.chevallier@...tlin.com>, 
    Sylvain Girard <sylvain.girard@...com>, 
    Pascal EBERHARD <pascal.eberhard@...com>, 
    Richard Tresidder <rtresidd@...ctromag.com.au>, 
    Linus Walleij <linus.walleij@...aro.org>, 
    Florian Fainelli <f.fainelli@...il.com>, Andrew Lunn <andrew@...n.ch>, 
    netdev@...r.kernel.org, linux-stm32@...md-mailman.stormreply.com, 
    linux-arm-kernel@...ts.infradead.org, stable@...r.kernel.org
Subject: Re: [PATCH net v3 1/1] net: stmmac: Prevent DSA tags from breaking
 C

On Mon, 8 Jan 2024, Vladimir Oltean wrote:

> On Mon, Jan 08, 2024 at 03:23:38PM +0100, Romain Gantois wrote:
> > I see, the kernel docs were indeed enlightening on this point. As a side note, 
> > I've just benchmarked both the "with-inline" and "without-inline" versions. 
> > First of all, objdump seems to confirm that GCC does indeed follow this pragma 
> > in this particular case. Also, RX perfs are better with stmmac_has_ip_ethertype 
> > inlined, but TX perfs are actually consistently worse with this function 
> > inlined, which could very well be caused by cache effects.
> > 
> > In any case, I think it is better to remove the "inline" pragma as you said. 
> > I'll do that in v4.
> 
> Are you doing any code instrumentation, or just measuring the results
> and deducing what might cause them?
> 
> It might be worth looking at the perf events and seeing what function
> consumes the most amount of time.
> 
> CPU_CORE=0
> perf record -e cycles -C $CPU_CORE sleep 10 && perf report
> perf record -e cache-misses -C $CPU_CORE sleep 10 && perf report
> 

Unfortunately my hardware doesn't support these perf events, but I did
manage to do some instrumentation with the ftrace profiler, under the same
test conditions as before (10-second iperf3 runs with unfragmented UDP
packets):

no inline TX:
  average time per call for stmmac_xmit(): 85us
  average time per call for stmmac_has_ip_ethertype(): 2us

no inline RX:
  average time per call for stmmac_napi_poll_rx(): 8142us
  average time per call for stmmac_has_ip_ethertype(): 2us

inline TX:
  average time per call for stmmac_xmit(): 85us

inline RX:
  average time per call for stmmac_napi_poll_rx(): 8410us
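
For reference, per-call averages like the ones above can be collected with
the ftrace function profiler roughly as sketched below. This is only a
sketch, not necessarily the exact procedure used for these numbers: it
assumes a kernel built with CONFIG_FUNCTION_PROFILER (timing columns also
need the function graph tracer), root privileges, and tracefs mounted at
/sys/kernel/tracing. The helper name profile_funcs and the TRACEFS variable
are hypothetical. Note that an inlined function has no entry point of its
own, which is why stmmac_has_ip_ethertype() has no line in the "inline"
runs.

```shell
#!/bin/sh
# Sketch only: assumes CONFIG_FUNCTION_PROFILER=y, root, and tracefs
# mounted at $TRACEFS. profile_funcs is a hypothetical helper name.
TRACEFS=${TRACEFS:-/sys/kernel/tracing}

# usage: profile_funcs 'glob' command [args...]
profile_funcs() {
    glob=$1
    shift

    # Restrict profiling to the functions of interest.
    echo "$glob" > "$TRACEFS/set_ftrace_filter" || return 1

    # Enable the profiler, run the workload, then disable it again.
    echo 1 > "$TRACEFS/function_profile_enabled" || return 1
    "$@"
    echo 0 > "$TRACEFS/function_profile_enabled"

    # One statistics file per CPU: trace_stat/function0, function1, ...
    cat "$TRACEFS"/trace_stat/function*
}

# Example invocation (server address is a placeholder):
#   profile_funcs 'stmmac_*' iperf3 -c <server> -u -t 10
```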

It seems like this time, RX performed slightly worse with the function
inlined (8410us vs. 8142us per stmmac_napi_poll_rx() call), while TX was
unchanged at 85us. To be honest, I'm starting to doubt the reproducibility
of these tests. In any case, it seems better to just remove the "inline"
and let GCC do the optimizing.
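
To put the RX gap in relative terms, the two averages can be compared as a
percentage. The snippet below is plain arithmetic on the figures quoted
above; rel_diff_pct is a hypothetical helper name, and a single pair of
runs says nothing about run-to-run variance.

```shell
#!/bin/sh
# rel_diff_pct OLD NEW: print the change from OLD to NEW in percent,
# rounded to two decimals. Hypothetical helper, for illustration only.
rel_diff_pct() {
    LC_ALL=C awk -v old="$1" -v cur="$2" \
        'BEGIN { printf "%.2f\n", (cur - old) * 100 / old }'
}

rel_diff_pct 8142 8410   # stmmac_napi_poll_rx(), no inline vs. inline: 3.29
rel_diff_pct 85 85       # stmmac_xmit(), no inline vs. inline: 0.00
```

A difference of about 3% from a single pair of 10-second runs is small
enough that repeating each configuration several times would be needed
before attributing it to inlining.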

Best Regards,

-- 
Romain Gantois, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
