Message-ID: <ed3c08d2-04ba-217e-9924-28cab7750234@csgroup.eu>
Date: Fri, 5 Mar 2021 07:38:25 +0100
From: Christophe Leroy <christophe.leroy@...roup.eu>
To: Segher Boessenkool <segher@...nel.crashing.org>,
Nick Desaulniers <ndesaulniers@...gle.com>
Cc: Mark Rutland <mark.rutland@....com>,
Marco Elver <elver@...gle.com>,
Catalin Marinas <catalin.marinas@....com>,
linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
LKML <linux-kernel@...r.kernel.org>,
kasan-dev <kasan-dev@...glegroups.com>,
Mark Brown <broonie@...nel.org>,
Paul Mackerras <paulus@...ba.org>,
linux-toolchains@...r.kernel.org, Will Deacon <will@...nel.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH v1] powerpc: Include running function as first entry in
save_stack_trace() and friends
On 04/03/2021 at 20:24, Segher Boessenkool wrote:
> On Thu, Mar 04, 2021 at 09:54:44AM -0800, Nick Desaulniers wrote:
>> On Thu, Mar 4, 2021 at 9:42 AM Marco Elver <elver@...gle.com> wrote:
>> include/linux/compiler.h:246:
>> prevent_tail_call_optimization
>>
>> commit a9a3ed1eff36 ("x86: Fix early boot crash on gcc-10, third try")
https://github.com/linuxppc/linux/commit/a9a3ed1eff36
>
> That is much heavier than needed (an mb()). You can just put an empty
> inline asm after a call before a return, and that call cannot be
> optimised to a sibling call (the end of a function is an implicit
> return):
>
> Instead of:
>
> void g(void);
> void f(int x)
> {
> 	if (x)
> 		g();
> }
>
> Do:
>
> void g(void);
> void f(int x)
> {
> 	if (x)
> 		g();
> 	asm("");
> }
>
> This costs no extra instructions, and certainly not something as heavy
> as an mb()! It works without the "if" as well, of course, but with it
> it is a more interesting example of a tail call.
In the commit mentioned at the top, it is said:
The next attempt to prevent compilers from tail-call optimizing
the last function call cpu_startup_entry(), ... , was to add an empty asm("").
This current solution was short and sweet, and reportedly, is supported
by both compilers but we didn't get very far this time: future (LTO?)
optimization passes could potentially eliminate this, which leads us
to the third attempt: having an actual memory barrier there which the
compiler cannot ignore or move around etc.
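
To make the two options concrete, here is a minimal userspace sketch that puts them side by side. It is only an illustration under stated assumptions: sketch_mb() is a hypothetical stand-in built on the GCC/Clang __atomic_thread_fence() builtin, not the kernel's mb(), and f_empty_asm(), f_barrier() and the calls counter are names made up for this example. Whether a given compiler really emits a normal call instead of a sibling call has to be checked in the generated assembly (for instance with -O2 -S); the sketch does not prove it.

```c
/*
 * Hypothetical stand-in for the kernel's mb(). This is an assumption for
 * the sketch: a sequentially consistent fence from the GCC/Clang builtins.
 */
#define sketch_mb() __atomic_thread_fence(__ATOMIC_SEQ_CST)

/* Counts how often g() ran, so the behaviour can be checked. */
static int calls;

/* noinline keeps g() as a real call in both variants. */
__attribute__((noinline)) void g(void)
{
	calls++;
}

/*
 * Variant 1: empty inline asm between the call and the implicit return,
 * as suggested in the quoted mail. __asm__ is the spelling of asm that
 * is also accepted in strict ISO C modes.
 */
void f_empty_asm(int x)
{
	if (x)
		g();
	__asm__("");
}

/*
 * Variant 2: a full barrier between the call and the implicit return,
 * which is the approach the commit message ends up with.
 */
void f_barrier(int x)
{
	if (x)
		g();
	sketch_mb();
}
```

Both variants behave the same from the caller's point of view; the difference is only in what sits between the call to g() and the return, and therefore in how much freedom the compiler has to turn the call into a jump.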
Christophe