Date:	Thu, 24 May 2007 13:40:54 -0400
From:	Rob Landley <rob@...dley.net>
To:	Jan Engelhardt <jengelh@...ux01.gwdg.de>
Cc:	Adrian Bunk <bunk@...sta.de>,
	Arjan van de Ven <arjan@...ux.intel.com>,
	linux-kernel@...r.kernel.org
Subject: Re: Status of CONFIG_FORCED_INLINING?

On Thursday 24 May 2007 12:29 pm, Jan Engelhardt wrote:
> 
> On May 23 2007 23:22, Adrian Bunk wrote:
> >
> >And we need only two different inline levels (__always_inline and
> >"let the compiler decide"), not three (__always_inline, inline and
> >"let the compiler decide").
> 
> "inline" is "let the compiler decide".

The compiler decides anyway.  That's why we need a noinline to tell it _not_ 
to spontaneously inline things it shouldn't.
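
For illustration, a minimal sketch of what such a marker looks like, assuming 
GCC or Clang attribute syntax.  The names my_noinline, slow_path and caller 
are hypothetical, not taken from the kernel tree.

```c
#include <assert.h>

/* Assumption: GCC/Clang attribute syntax. A local macro name is used here
 * so the sketch does not clash with any system header. */
#define my_noinline __attribute__((noinline))

/* Not marked "inline", yet without the attribute an optimizing compiler
 * may still inline this static function into its caller. */
static my_noinline int slow_path(int x)
{
	return x * 2 + 1;
}

/* Calls slow_path() through a real call instruction when built with
 * GCC/Clang, regardless of optimization level. */
int caller(int x)
{
	return slow_path(x);
}

/* The attribute changes code generation, not the result. */
void example_noinline(void)
{
	assert(caller(3) == 7);
}
```

Whether the call was really kept out of line can be checked by looking at the 
generated assembly (for example with gcc -O2 -S).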

> If it is not, then it is "let the 
> compiler decide, based on my bias that I think it should be inlined".

Things like unlikely() actually produce different code (and are, technically 
speaking, workarounds for the high branch misprediction penalty in modern 
processor designs).

Things like "register" are a hint to the compiler that it's free to ignore.  
How often do we use the "register" keyword in Linux?  Register allocation is 
totally different on different processors, and deprived of context the hint 
is almost meaningless.  If it doesn't guarantee anything and has far less of 
a performance impact than the difference between -O2 and -Os (or between gcc 
3.4 and 4.1), then it's probably an unnecessary complication.

Inlining is even worse because whether or not it's a performance win depends 
on the L1 cache size, cache line size and alignment, L2 cache, DRAM fetch 
latency, and so on.  This varies all over the map within a given processor 
line, let alone between processors.

If we actually need something inlined, we can tell it to do so reliably with 
__always_inline (and change __always_inline to be MORE explicit each time gcc 
breaks the previous way of saying "yes really inline it dammit").  If we can 
live with it not being inlined, why not leave it to the compiler whether or 
not to inline it?
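
A minimal sketch of the two levels described above, assuming GCC or Clang.  
The macro my_always_inline only approximates the kernel's __always_inline, and 
the function names are hypothetical.

```c
#include <assert.h>

/* Assumption: GCC/Clang. A local name is used because some libc headers
 * already define __always_inline. */
#define my_always_inline inline __attribute__((always_inline))

/* Level 1: this must be inlined, so say so explicitly. */
static my_always_inline int must_inline(int x)
{
	return x + 1;
}

/* Level 2: no annotation at all. The compiler decides whether inlining
 * is a win for the target it is building for. */
static int compiler_decides(int x)
{
	return x * 3;
}

int combined(int x)
{
	return compiler_decides(must_inline(x));
}

/* Either way the result is the same; only the generated code differs. */
void example_inline_levels(void)
{
	assert(combined(2) == 9);
}
```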

Rob