Message-ID: <YE+i/VWITCCb37tD@hirez.programming.kicks-ass.net>
Date: Mon, 15 Mar 2021 19:10:05 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Sedat Dilek <sedat.dilek@...il.com>
Cc: Borislav Petkov <bp@...en8.de>, x86@...nel.org,
rostedt@...dmis.org, hpa@...or.com, torvalds@...uxfoundation.org,
linux-kernel@...r.kernel.org, linux-toolchains@...r.kernel.org,
jpoimboe@...hat.com, alexei.starovoitov@...il.com,
mhiramat@...nel.org
Subject: Re: [PATCH 0/2] x86: Remove ideal_nops[]
On Mon, Mar 15, 2021 at 06:04:41PM +0100, Sedat Dilek wrote:
> make V=1 -j4 LLVM=1 LLVM_IAS=1
So for giggles I checked: neither GCC nor LLVM seems to emit prefix NOPs
when building with -march=sandybridge; they always use NOPL.
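For reference, here is a rough sketch of the two padding styles being
compared. The NOPL byte sequences are the multi-byte forms recommended in
the Intel SDM; the prefix form is the single-byte NOP (0x90) preceded by
0x66 operand-size prefixes. The names NOPL and prefix_nop are made up for
this illustration and are not taken from any compiler or from the kernel.

```python
# Illustrative only: NOPL and prefix_nop are hypothetical names.

# Intel-recommended multi-byte NOPs, keyed by length in bytes.
# Lengths 3 and up use the NOPL opcode (0F 1F /0) with different
# ModRM/SIB/displacement sizes to reach the desired length.
NOPL = {
    1: bytes.fromhex("90"),
    2: bytes.fromhex("6690"),
    3: bytes.fromhex("0f1f00"),
    4: bytes.fromhex("0f1f4000"),
    5: bytes.fromhex("0f1f440000"),
    6: bytes.fromhex("660f1f440000"),
    7: bytes.fromhex("0f1f8000000000"),
    8: bytes.fromhex("0f1f840000000000"),
}


def prefix_nop(length: int) -> bytes:
    """Build a prefix NOP: (length - 1) 0x66 prefixes, then 0x90.

    How many prefixes a given CPU decodes cheaply is
    microarchitecture-dependent; this function does not model that.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return b"\x66" * (length - 1) + b"\x90"


if __name__ == "__main__":
    for n in (3, 4):
        print(n, "NOPL:", NOPL[n].hex(" "), "| prefix:", prefix_nop(n).hex(" "))
```

Both styles produce the same bytes at lengths 1 and 2; they only diverge
from 3 bytes upward.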
Furthermore, the kernel explicitly sets: -falign-jumps=1
-falign-loops=1, which, when not specified, default to 16 or so.
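To make the effect of those flags concrete, here is a minimal sketch of
how much padding an alignment request costs. It assumes a plain
power-of-two alignment and ignores any max-skip limit the compiler may
apply; padding_needed is a hypothetical helper, not a compiler interface.

```python
def padding_needed(address: int, alignment: int) -> int:
    """Return the NOP bytes needed to round `address` up to `alignment`.

    Assumes `alignment` is a power of two, as with jump/loop alignment.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # -address mod alignment, computed with a mask.
    return -address & (alignment - 1)


if __name__ == "__main__":
    target = 0x1003  # made-up address of a jump target
    print("align 16:", padding_needed(target, 16), "bytes of NOPs")
    print("align 1: ", padding_needed(target, 1), "bytes of NOPs")
```

With an alignment of 1 the result is always 0, which is why the kernel
build emits no alignment NOPs for jumps and loops, while a default of 16
can cost up to 15 bytes per aligned target.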
This means that your userspace is *littered* with NOPL, even when you
build your entire distro from source with -march=sandybridge.
(arch/gentoo FTW I suppose).
(The only good news is that recent LLVM has a pass that uses alternative
instruction encodings to grow a basic block, which minimizes the number
of NOPs it needs to emit at the end to satisfy the jump/loop alignment.)
So if you *really* deeply care about NOP performance on your SNB, start
by teaching LLVM about prefix NOPs and rebuild your complete userspace.
At that point, you can do some trivial patches to the kernel to make it
use -march=sandybridge and prefix NOPs too.
Until that time, the vast majority of NOPs your CPU will execute will be
NOPL.