Message-Id: <A9379DDB-3210-407C-8157-8DA980944F8C@amacapital.net>
Date: Mon, 8 Oct 2018 01:33:14 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Steven Rostedt <rostedt@...dmis.org>, linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Ingo Molnar <mingo@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Masami Hiramatsu <mhiramat@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Matthew Helsley <mhelsley@...are.com>,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
David Woodhouse <dwmw2@...radead.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Jason Baron <jbaron@...mai.com>, Jiri Kosina <jkosina@...e.cz>,
ard.biesheuvel@...aro.org, Andy Lutomirski <luto@...nel.org>
Subject: Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
> On Oct 8, 2018, at 12:21 AM, Peter Zijlstra <peterz@...radead.org> wrote:
>
>> On Sat, Oct 06, 2018 at 09:39:05AM -0400, Steven Rostedt wrote:
>> On Sat, 6 Oct 2018 14:12:11 +0200
>> Peter Zijlstra <peterz@...radead.org> wrote:
>>
>>>> On Fri, Oct 05, 2018 at 09:51:11PM -0400, Steven Rostedt wrote:
>>>> +#define arch_dynfunc_trampoline(name, def) \
>>>> + asm volatile ( \
>>>> + ".globl dynfunc_" #name "; \n\t" \
>>>> + "dynfunc_" #name ": \n\t" \
>>>> + "jmp " #def " \n\t" \
>>>> + ".balign 8 \n \t" \
>>>> + : : : "memory" )
>>>
>>> Bah, what is it with you people and trampolines. Why can't we, just like
>>> jump_label, patch the call directly?
>>>
>>> The whole call+jmp thing is silly, don't do that. It just wrecks I$ and
>>> is slower for no real reason afaict.
>>
>> My first attempt was to do just that. But to add a label at the
>> call site required handling all the parameters too. See my branch:
>> ftrace/jump_function-v1 for how ugly it got (and it didn't work).
>
> Can't we hijack the relocation records for these functions before they
> get thrown out in the (final) link pass or something?

I could be talking out my arse here, but I thought we could do this, too, then changed my mind. The relocation records give us the location of the call or jump operand, but they don’t give the address of the beginning of the instruction. If the instruction crosses a cache line boundary, don’t we need to use the int3 patching trick? And that requires knowing which byte to patch with int3.

Or am I wrong and can the CPUs we care about correctly handle a locked write to part of an instruction that crosses a cache line boundary?
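
To make the concern concrete, here is a minimal sketch in C, not code from the patch series. It assumes 64-byte cache lines and considers only the 5-byte `call rel32` (opcode 0xE8) and `jmp rel32` (opcode 0xE9) encodings. The helper names `crosses_cache_line` and `guess_insn_start` are hypothetical. The second helper guesses the instruction start from the byte just before the operand that a relocation record points at. That is only a heuristic, which is exactly the gap described above: the relocation itself does not say where the instruction begins.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64u /* assumption: 64-byte cache lines */
#define REL32_INSN_SIZE 5u  /* 1 opcode byte + 4-byte rel32 operand */

/* True if the bytes [addr, addr + len) span more than one cache line. */
static bool crosses_cache_line(uintptr_t addr, unsigned int len)
{
	return (addr / CACHE_LINE_SIZE) !=
	       ((addr + len - 1) / CACHE_LINE_SIZE);
}

/*
 * Hypothetical helper: a relocation record gives the offset of the rel32
 * operand, not of the instruction.  Guess the instruction start by checking
 * whether the preceding byte is a call (0xE8) or jmp (0xE9) opcode.
 * Returns the guessed offset of the first instruction byte (the byte int3
 * patching would overwrite first), or -1 if the byte is not recognized.
 * This is a heuristic: other encodings (e.g. two-byte-opcode jcc rel32)
 * are not handled.
 */
static long guess_insn_start(const uint8_t *text, size_t operand_off)
{
	uint8_t opcode;

	if (operand_off == 0)
		return -1;

	opcode = text[operand_off - 1];
	if (opcode == 0xE8 || opcode == 0xE9)
		return (long)(operand_off - 1);

	return -1;
}
```

For example, a `call rel32` whose first byte sits at offset 60 of a cache line occupies offsets 60 through 64 and therefore straddles the boundary, while the same instruction starting at offset 59 does not.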