Message-ID: <Pine.LNX.4.64.0706250010360.6215@asgard.lang.hm>
Date:	Mon, 25 Jun 2007 00:13:46 -0700 (PDT)
From:	david@...g.hm
To:	Segher Boessenkool <segher@...nel.crashing.org>
cc:	Benjamin LaHaise <bcrl@...ck.org>, linux-kernel@...r.kernel.org,
	Arjan van de Ven <arjan@...radead.org>,
	Adrian Bunk <bunk@...sta.de>,
	Oleg Verych <olecom@...wer.upol.cz>, rae l <crquan@...il.com>
Subject: Re: -Os versus -O2

On Mon, 25 Jun 2007, Segher Boessenkool wrote:

>>  then do we need a new option 'optimize for best overall performance' that
>>  goes for size (and the corresponding wins there) most of the time, but is
>>  ignored where it makes a huge difference?
>
> That's -Os mostly.  Some awful CPUs really need higher
> loop/label/function alignment though to get any
> performance; you could add -falign-xxx options for those.
>
>>  in reality this was a flaw in gcc that on modern CPU's with the larger
>>  difference between CPU speed and memory speed it still preferred to unroll
>>  loops (eating more memory and blowing out the cpu cache) when it shouldn't
>>  have.
>
> You told it to unroll loops, so it did.  No flaw.  If you
> feel the optimisations enabled by -O2 should depend on the
> CPU tuning selected, please file a PR.
>
> Also note that whether or not it is profitable to unroll
> a particular loop depends largely on how "hot" that loop
> is, and GCC doesn't know much about that if you don't feed
> it profiling information (it can guess a bit, sure, but it
> can guess wrong too).

actually, what you are saying is that the compiler can't know enough to 
figure out how to optimize for speed. it will just do what you tell it to, 
either unroll loops or not.

this argues that both -O2 and -Os are incorrect for a project to use and
instead the project needs to make its own decisions on this.

if this is the true feeling of the gcc team I'm very disappointed; it
feels like a huge step backwards.
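
As a rough illustration of a project "making its own decisions": the sketch below assumes the tree is built with -Os by default and that the GCC in use supports the `hot` and `cold` function attributes (not every release does, and nothing in this thread says the kernel uses them). The names `checksum`, `report_error`, `HOT_PATH` and `COLD_PATH` are made up for the example. The attributes are only hints; whether GCC actually unrolls or realigns anything because of them depends on the version and target.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC-style attributes; expand to nothing on other compilers. */
#if defined(__GNUC__)
#define HOT_PATH  __attribute__((hot))
#define COLD_PATH __attribute__((cold, noinline))
#else
#define HOT_PATH
#define COLD_PATH
#endif

/* Hypothetical hot loop: the project tells the compiler this function
 * is performance-critical, instead of relying on a global -O level. */
HOT_PATH uint32_t checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* Hypothetical rarely taken error path: marked cold so the compiler
 * may favour size over speed here and keep it away from hot code. */
COLD_PATH int report_error(int code)
{
    return -code;
}
```

A file like this could be compiled with something like `gcc -Os -c file.c`. The other route Segher mentions, feeding GCC real profiling information, would be a build with `-fprofile-generate`, a representative run, then a rebuild with `-fprofile-use`.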

David Lang