[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <eb15f2dd-92f3-423d-e22f-3d1528fb7b6e@rasmusvillemoes.dk>
Date: Fri, 30 Nov 2018 23:16:34 +0100
From: Rasmus Villemoes <linux@...musvillemoes.dk>
To: Josh Poimboeuf <jpoimboe@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Andy Lutomirski <luto@...capital.net>,
Peter Zijlstra <peterz@...radead.org>,
Andrew Lutomirski <luto@...nel.org>,
the arch/x86 maintainers <x86@...nel.org>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
Ard Biesheuvel <ard.biesheuvel@...aro.org>,
Ingo Molnar <mingo@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>, mhiramat@...nel.org,
jbaron@...mai.com, Jiri Kosina <jkosina@...e.cz>,
David.Laight@...lab.com, bp@...en8.de, julia@...com,
jeyu@...nel.org, Peter Anvin <hpa@...or.com>
Subject: Re: [PATCH v2 4/4] x86/static_call: Add inline static call
implementation for x86-64
On 29/11/2018 20.22, Josh Poimboeuf wrote:
> On Thu, Nov 29, 2018 at 02:16:48PM -0500, Steven Rostedt wrote:
>>> and honestly, the way "static_call()" works now, can you guarantee
>>> that the call-site doesn't end up doing that, and calling the
>>> trampoline function for two different static calls from one indirect
>>> call?
>>>
>>> See what I'm talking about? Saying "callers are wrapped in macros"
>>> doesn't actually protect you from the compiler doing things like that.
>>>
>>> In contrast, if the call was wrapped in an inline asm, we'd *know* the
>>> compiler couldn't turn a "call wrapper(%rip)" into anything else.
>>
>> But then we need to implement all numbers of parameters.
>
> I actually have an old unfinished patch which (ab)used C macros to
> detect the number of parameters and then setup the asm constraints
> accordingly. At the time, the goal was to optimize the BUG code.
>
> I had wanted to avoid this kind of approach for static calls, because
> "ugh", but now it's starting to look much more appealing.
>
> Behold:
>
> diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h
> index aa6b2023d8f8..d63e9240da77 100644
> --- a/arch/x86/include/asm/bug.h
> +++ b/arch/x86/include/asm/bug.h
> @@ -32,10 +32,59 @@
>
> #ifdef CONFIG_DEBUG_BUGVERBOSE
>
> -#define _BUG_FLAGS(ins, flags) \
> +#define __BUG_ARGS_0(ins, ...) \
> +({\
> + asm volatile("1:\t" ins "\n"); \
> +})
> +#define __BUG_ARGS_1(ins, ...) \
> +({\
> + asm volatile("1:\t" ins "\n" \
> + : : "D" (ARG1(__VA_ARGS__))); \
> +})
> +#define __BUG_ARGS_2(ins, ...) \
> +({\
> + asm volatile("1:\t" ins "\n" \
> + : : "D" (ARG1(__VA_ARGS__)), \
> + "S" (ARG2(__VA_ARGS__))); \
> +})
> +#define __BUG_ARGS_3(ins, ...) \
> +({\
> + asm volatile("1:\t" ins "\n" \
> + : : "D" (ARG1(__VA_ARGS__)), \
> + "S" (ARG2(__VA_ARGS__)), \
> + "d" (ARG3(__VA_ARGS__))); \
> +})
wouldn't you need to tie all these to (unused) outputs as well as adding
the remaining caller-saved registers to the clobber list? Maybe not for
the WARN machinery(?), but at least for stuff that should look like a
normal call to gcc? Then there's %rax which is either a clobber or an
output, and if there's not to be a separate static_call_void(), one
would need to do some __builtin_choose_expr(__same_type(void, f(...)), ...).
Rasmus
Powered by blists - more mailing lists