[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wj-m=Oo5jy_+0ojXahT6bb_HvuU5=Lo0iALsjruo8j6Dg@mail.gmail.com>
Date: Fri, 22 Jul 2022 15:18:33 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Tim Chen <tim.c.chen@...ux.intel.com>
Cc: Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>,
"the arch/x86 maintainers" <x86@...nel.org>,
Josh Poimboeuf <jpoimboe@...nel.org>,
Andrew Cooper <Andrew.Cooper3@...rix.com>,
Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
Johannes Wikner <kwikner@...z.ch>,
Alyssa Milburn <alyssa.milburn@...ux.intel.com>,
Jann Horn <jannh@...gle.com>, "H.J. Lu" <hjl.tools@...il.com>,
Joao Moreira <joao.moreira@...el.com>,
Joseph Nuzman <joseph.nuzman@...el.com>,
Steven Rostedt <rostedt@...dmis.org>,
Juergen Gross <jgross@...e.com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>
Subject: Re: [patch 00/38] x86/retbleed: Call depth tracking mitigation
On Fri, Jul 22, 2022 at 1:11 PM Tim Chen <tim.c.chen@...ux.intel.com> wrote:
>
> Here are some performance numbers for FIO running on a SKX server with
> Intel Cold Stream SSD. Padding improves performance significantly.
That certainly looks oh-so-much better than those disgusting ibrs numbers.
One thing that I wonder about - gcc already knows about leaf functions
for other reasons (stack setup is often different anyway), and I
wonder it it might be worth looking at making leaf functions avoid the
whole depth counting, and just rely on a regular call/ret.
The whole call chain thing is already not entirely exact and is
counting to a smaller value than the real RSB size.
And leaf functions are generally the smallest and most often called,
so it might be noticeable on profiles and performance numbers to just
say "we already know this is a leaf, there's no reason to increment
the depth for this only to decrement it when it returns".
The main issue is obviously that the return instruction has to be a
non-decrementing one too for the leaf function case, so it's not just
"don't do the counting on entry", it's also a "don't do the usual
rethunk on exit".
So I just wanted to raise this, not because it's hugely important, but
just to make people think about it - I have these very clear memories
of the whole "don't make leaf functions create a stack frame" having
been a surprisingly big deal for some loads.
Of course, sometimes when I have clear memories, they turn out to be
just some drug-induced confusion in the end. But I know people
experimented with "-fno-omit-frame-pointer -momit-leaf-frame-pointer"
and that it made a big difference (but caused some issue with pvops
hidden in asm so that gcc incorrectly thought they were leaf functions
when they weren't).
Linus
Powered by blists - more mailing lists