lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <7364B113-DD65-423D-BED3-FF90C4DF8334@amacapital.net>
Date:   Tue, 22 Oct 2019 15:45:26 -0700
From:   Andy Lutomirski <luto@...capital.net>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Steven Rostedt <rostedt@...dmis.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
        Nadav Amit <nadav.amit@...il.com>,
        Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Song Liu <songliubraving@...com>,
        Masami Hiramatsu <mhiramat@...nel.org>
Subject: Re: [PATCH 3/3] x86/ftrace: Use text_poke()



>> On Oct 22, 2019, at 2:58 PM, Alexei Starovoitov <alexei.starovoitov@...il.com> wrote:
>> 
>> On Tue, Oct 22, 2019 at 05:04:30PM -0400, Steven Rostedt wrote:
>> I gave a solution for this. And that is to add another flag to allow
>> for just the minimum to change the ip. And we can even add another flag
>> to allow for changing the stack if needed (to emulate a call with the
>> same parameters).
> 
> your solution is to reduce the overhead.
> my solution is to remove it competely. See the difference?
> 
>> By doing this work, live kernel patching will also benefit. Because it
>> is also dealing with the unnecessary overhead of saving regs.
>> And we could possibly even have kprobes benefit from this if a kprobe
>> doesn't need full regs.
> 
> Neither of two statements are true. The per-function generated trampoline
> I'm talking about is bpf specific. For a function with two arguments it's just:
> push rbp 
> mov rbp, rsp
> push rdi
> push rsi
> lea  rdi,[rbp-0x10]
> call jited_bpf_prog
> pop rsi
> pop rdi
> leave
> ret

Why are you saving rsi?  You said upthread that you’re saving the args, but rsi is already available in rsi.

Just how custom is this bpf program?  It seems to clobber no regs (except args), and it doesn’t return anything. Is it entirely specific to the probed function?  If so, why not just call it directly?

In any event, I think you can’t *just* use text_poke.  Something needs to coordinate to ensure that, if you bpf trace an already-kprobed function, the right thing happens.

FWIW, if you are going to use a trampoline like this, consider using r11 for the caller frame instead of rsi.  You won’t need to restore it. But I’m wondering whether the bpf jitted code could just directly access the frame instead of indirecting through a register. (Or am I entirely misunderstanding what rdi is for in your example?)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ