Message-ID: <YjGvauc0NYh2XXoc@hirez.programming.kicks-ass.net>
Date: Wed, 16 Mar 2022 10:35:38 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Kumar Kartikeya Dwivedi <memxor@...il.com>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>,
X86 ML <x86@...nel.org>, joao@...rdrivepizza.com,
hjl.tools@...il.com, Josh Poimboeuf <jpoimboe@...hat.com>,
Andrew Cooper <andrew.cooper3@...rix.com>,
LKML <linux-kernel@...r.kernel.org>,
Nick Desaulniers <ndesaulniers@...gle.com>,
Kees Cook <keescook@...omium.org>,
Sami Tolvanen <samitolvanen@...gle.com>,
Mark Rutland <mark.rutland@....com>, alyssa.milburn@...el.com,
Miroslav Benes <mbenes@...e.cz>,
Steven Rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>, bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH v4 00/45] x86: Kernel IBT

On Tue, Mar 15, 2022 at 10:00:43AM +0100, Peter Zijlstra wrote:
> On Tue, Mar 15, 2022 at 02:14:02AM +0530, Kumar Kartikeya Dwivedi wrote:
>
> > [ Note: I have no experience with trampoline code or IBT so what follows might
> > be incorrect. ]
> >
> > In case of fexit and fmod_ret, we call original function (but skip
> > X86_PATCH_SIZE bytes), with ENDBR we must also skip those 4 bytes, but in some
> > cases like bpf_fentry_test1, for which this test has fmod_ret prog, compiler
> > (gcc 11) emits endbr64, but not for do_init_module, for which we do fexit.
> >
> > This means for do_init_module, orig_call += X86_PATCH_SIZE +
> > ENDBR_INSN_SIZE would skip more bytes than needed to emit call to original
> > function, which explains why I was seeing crash in the middle of
> > 'mov edx, 0x10' instruction.
> >
> > The diff below fixes the problem for me, and allows the test to pass.
> >
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index b98e1c95bcc4..760c9a3c075f 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -2031,11 +2031,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >
> >  	ip_off = stack_size;
> >
> > -	if (flags & BPF_TRAMP_F_SKIP_FRAME)
> > +	if (flags & BPF_TRAMP_F_SKIP_FRAME) {
> >  		/* skip patched call instruction and point orig_call to actual
> >  		 * body of the kernel function.
> >  		 */
> > -		orig_call += X86_PATCH_SIZE + ENDBR_INSN_SIZE;
> > +		if (is_endbr(*(u32 *)orig_call))
> > +			orig_call += ENDBR_INSN_SIZE;
> > +		orig_call += X86_PATCH_SIZE;
> > +	}
> >
> >  	prog = image;
>
> Hmm, so I was under the impression that this was targeting the NOP from
> emit_prologue(), and that has an unconditional ENDBR. If this is instead
> targeting the 'start of random kernel function' then yes, what you
> propose will work.

Can you confirm that orig_call can be any kernel function? If so, I
think this will still do the wrong thing for a notrace function: it has
no __fentry__ site, so unconditionally skipping those 5 bytes will land
you somewhere nonsensical.

This would not be a new issue, but perhaps it should be clarified and/or
fixed.
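
To make the concern concrete, here is a minimal user-space sketch (not
kernel code) of the offset computation under discussion. The helper
names starts_with_endbr64(), looks_like_patch_site() and body_offset()
are hypothetical and exist only for illustration. The sketch assumes
that a patchable __fentry__ site shows up as either a 5-byte NOP
(0f 1f 44 00 00) or a call rel32 (opcode e8), and that ENDBR64 encodes
as f3 0f 1e fa. It skips the ENDBR only when one is present, and refuses
to skip the 5 patch bytes when the function does not appear to have a
patch site at all, as would be the case for a notrace function.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ENDBR_INSN_SIZE 4	/* size of endbr64 */
#define X86_PATCH_SIZE  5	/* size of the patched call/NOP */

static const uint8_t endbr64_bytes[ENDBR_INSN_SIZE] = { 0xf3, 0x0f, 0x1e, 0xfa };
static const uint8_t nop5_bytes[X86_PATCH_SIZE] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* Hypothetical helper: does the code at p begin with endbr64? */
static int starts_with_endbr64(const uint8_t *p)
{
	return memcmp(p, endbr64_bytes, ENDBR_INSN_SIZE) == 0;
}

/*
 * Hypothetical helper: does p look like a patchable 5-byte site?
 * Assumption: the site holds either a 5-byte NOP or a call rel32.
 * Matching on the e8 opcode alone is a heuristic, not a proof.
 */
static int looks_like_patch_site(const uint8_t *p)
{
	return p[0] == 0xe8 || memcmp(p, nop5_bytes, X86_PATCH_SIZE) == 0;
}

/*
 * Return the number of bytes to skip from the function entry to reach
 * the function body, or -1 if no patch site is found, in which case
 * skipping X86_PATCH_SIZE bytes would land inside some other instruction.
 */
static long body_offset(const uint8_t *func)
{
	long off = 0;

	if (starts_with_endbr64(func))
		off += ENDBR_INSN_SIZE;

	if (!looks_like_patch_site(func + off))
		return -1;

	return off + X86_PATCH_SIZE;
}
```

With these assumptions, a function that starts with endbr64 followed by
the 5-byte NOP yields an offset of 9, one that starts directly with the
patched call yields 5, and one that starts with an ordinary instruction
such as push %rbp yields -1 instead of a bogus address.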