Message-ID: <CAMj1kXEVjKGkRU_4JWH5d9YzT+pYVuEZYPNLw0VkUAb6d+W9kQ@mail.gmail.com>
Date: Tue, 21 Sep 2021 16:44:56 +0200
From: Ard Biesheuvel <ardb@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Frederic Weisbecker <frederic@...nel.org>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
James Morse <james.morse@....com>,
Quentin Perret <qperret@...gle.com>,
Mark Rutland <mark.rutland@....com>,
Christophe Leroy <christophe.leroy@...roup.eu>
Subject: Re: [PATCH 2/4] arm64: implement support for static call trampolines

On Tue, 21 Sept 2021 at 09:10, Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Tue, Sep 21, 2021 at 01:32:35AM +0200, Frederic Weisbecker wrote:
>
> > +#define __ARCH_DEFINE_STATIC_CALL_TRAMP(name, target) \
> > + asm(" .pushsection .static_call.text, \"ax\" \n" \
> > + " .align 3 \n" \
> > + " .globl " STATIC_CALL_TRAMP_STR(name) " \n" \
> > + STATIC_CALL_TRAMP_STR(name) ": \n" \
> > + " hint 34 /* BTI C */ \n" \
> > + " adrp x16, 1f \n" \
> > + " ldr x16, [x16, :lo12:1f] \n" \
> > + " cbz x16, 0f \n" \
> > + " br x16 \n" \
> > + "0: ret \n" \
> > + " .popsection \n" \
> > + " .pushsection .rodata, \"a\" \n" \
> > + " .align 3 \n" \
> > + "1: .quad " target " \n" \
> > + " .popsection \n")
>
> So I like what Christophe did for PPC32:
>
> https://lkml.kernel.org/r/6ec2a7865ed6a5ec54ab46d026785bafe1d837ea.1630484892.git.christophe.leroy@csgroup.eu
>
> Where he starts with an unconditional jmp and uses that IFF the offset
> fits and only does the data load when it doesn't. Ard, wouldn't that
> also make sense on ARM64? I'm thinking most in-kernel function pointers
> would actually fit, it's just the module muck that gets to have too
> large pointers, no?
>
Yeah, I'd have to page that back in. But it seems like the following

    bti   c
    <branch>
    adrp  x16, <literal>
    ldr   x16, [x16, ...]
    br    x16

with <branch> set to 'b target' for the near targets, 'ret' for the
NULL target, or 'nop' for the far targets, should work, and the
architecture permits patching branches into NOPs and vice versa
without special synchronization. But I must be missing something here,
or why did we have that long discussion before?
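
To make the three cases concrete, here is a rough sketch of how the
instruction for the <branch> slot could be picked. It is only an
illustration, not code from the patch: the function name
tramp_branch_insn() and the INSN_* macros are made up for this example,
'pc' is assumed to be the address of the <branch> slot itself, and the
near/far test is simply the +/-128 MiB reach of the A64 'b' immediate.

```c
#include <stdint.h>

/* A64 encodings; the macro names are hypothetical, for illustration only. */
#define INSN_NOP 0xd503201fu /* nop: fall through to the adrp/ldr/br path */
#define INSN_RET 0xd65f03c0u /* ret (x30): NULL target, return to caller  */
#define INSN_B   0x14000000u /* b <imm26>: opcode with a zero offset      */

/*
 * Hypothetical helper: choose the instruction for the <branch> slot.
 * 'pc' is the address of the slot, 'target' the static call target
 * (0 meaning NULL).
 */
static uint32_t tramp_branch_insn(uint64_t pc, uint64_t target)
{
	int64_t offset;

	if (target == 0)
		return INSN_RET;

	offset = (int64_t)(target - pc);

	/* 'b' takes a signed 26-bit word offset, i.e. +/-128 MiB in bytes. */
	if ((offset & 3) == 0 &&
	    offset >= -(INT64_C(1) << 27) && offset < (INT64_C(1) << 27))
		return INSN_B | (uint32_t)(((uint64_t)offset >> 2) & 0x03ffffffu);

	/* Too far for a direct branch: let the literal load handle it. */
	return INSN_NOP;
}
```

In the far case the literal would still need to be updated to the new
target before (or together with) the slot becoming a NOP; this sketch
only covers the choice of instruction, not the patching order.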