Message-ID: <1415286838.24404.35.camel@localhost>
Date:	Thu, 06 Nov 2014 16:13:58 +0100
From:	Hannes Frederic Sowa <hannes@...essinduktion.org>
To:	David Miller <davem@...emloft.net>
Cc:	netdev@...r.kernel.org, kernel@...r.kernel.org,
	dborkman@...hat.com, tgraf@...g.ch
Subject: Re: [PATCH net-next] fast_hash: avoid indirect function calls

On Wed, 2014-11-05 at 22:03 -0500, David Miller wrote:
> > Would it make sense to start suppressing the generation of local
> > functions for static inline functions whose address is taken?
> > 
> > E.g. we could use extern inline in a few cases (dst_output is often used
> > as a function pointer but marked static inline).  We could mark it as
> > extern inline and copy&paste the code to a .c file to prevent multiple
> > copies of machine code for this function. But because of the copy&paste I
> > did not in this case.
> 
> I'd say that perhaps dst_output() can be handled in the "traditional"
> way, by not inlining it ever.

Yes, that sounds sane. dst_output (8 copies) and dst_discard (6 copies)
seem to be good candidates, but the change won't save much as they are
trivially short. The fast path mostly uses them as function pointers, so
we shouldn't see any slowdown here.
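
To illustrate the difference, here is a minimal sketch in C. struct pkt
and all function names in it are hypothetical stand-ins, not kernel code;
it only contrasts a header-style static inline helper whose address is
taken with the "traditional" single out-of-line definition.

```c
#include <assert.h>

/* Hypothetical stand-in for a packet; not the kernel's struct sk_buff. */
struct pkt {
	int len;
};

/*
 * Header-style helper. A direct call can be inlined, but once its
 * address is taken the compiler also has to emit an out-of-line copy
 * in every translation unit that does so.
 */
static inline int pkt_output_inline(struct pkt *p)
{
	return p->len;
}

/*
 * "Traditional" variant: the prototype would live in a header and this
 * single definition in exactly one .c file, so only one copy exists.
 */
int pkt_output(struct pkt *p);

int pkt_output(struct pkt *p)
{
	return p->len;
}

typedef int (*output_fn)(struct pkt *);

/* Taking the address forces the out-of-line copy of the inline helper. */
static output_fn pick_output(int use_inline)
{
	return use_inline ? pkt_output_inline : pkt_output;
}

/* Calls through the pointer, roughly as a fast path would. */
int send_via_pointer(struct pkt *p, int use_inline)
{
	return pick_output(use_inline)(p);
}
```

Both variants behave the same when called through a pointer; what
differs is how many copies of the machine code end up in the final
image.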

I figured out that making functions extern inline and copy&pasting the
code only makes sense for functions which don't themselves depend on
static inline functions, so this option to reduce code bloat is
absolutely useless.

Btw. our most duplicated function in an allyesconfig-build (not size
optimized) is netif(_tx)_stop_queue with 30 (or 180 if -Os) copies
(netif_wake_queue 135 copies with -Os, not visible in -O2).

> If we have indirect function invocations and non-direct inlines, maybe
> in the end it's better to have it in a single hot cache location, no?

I am not sure, and I think this very much depends on the CPU. But
reducing icache pressure is generally a good thing and tends to lead to
better performance, so in general I would agree. Also, unconditional
direct branches should be very fast nowadays.

But if we care about size, I wouldn't touch the code and would instead
hope that something like GNU gold's ICF (identical code folding), or
GNU ld's --gc-sections in combination with -ffunction-sections and
-fdata-sections, will be used by the kernel some day to eliminate
duplicate copies of functions.
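
As a rough sketch of what those options act on, here is a small C file
with the build commands in a comment. The file name and function names
are hypothetical; the flags themselves are existing GCC and GNU linker
options, but nothing here claims the kernel build can use them as is.

```c
#include <assert.h>

/*
 * Build sketch (icf_demo.c and main.o are assumed names):
 *   gcc -O2 -ffunction-sections -fdata-sections -c icf_demo.c
 *   gcc -fuse-ld=gold -Wl,--gc-sections -Wl,--icf=all icf_demo.o main.o
 */

/*
 * Two hypothetical helpers with identical bodies, standing in for the
 * copies of one static inline function emitted in different
 * translation units. ICF may fold them into a single copy.
 */
int queue_stop_a(unsigned long *state)
{
	*state |= 1UL;
	return 0;
}

int queue_stop_b(unsigned long *state)
{
	*state |= 1UL;
	return 0;
}

/*
 * Never referenced. With -ffunction-sections it lands in its own
 * section, which --gc-sections can then discard at link time.
 */
int never_called(void)
{
	return -1;
}
```

Note that with --icf=all two folded functions can end up sharing one
address, so code comparing function pointers for equality would need
care; gold also offers --icf=safe as a more conservative mode.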

Bye,
Hannes


