linux-kernel - Re: kprobes broken since 0d00449c7a28 ("x86: Replace ist_enter() with nmi

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <783eafee-681c-45b6-17ef-24473cb33aa1@suse.com>
Date:   Fri, 29 Jan 2021 08:36:30 +0200
From:   Nikolay Borisov <nborisov@...e.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     Masami Hiramatsu <masami.hiramatsu@...il.com>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Alexei Starovoitov <ast@...nel.org>, bpf@...r.kernel.org,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Steven Rostedt <rostedt@...dmis.org>
Subject: Re: kprobes broken since 0d00449c7a28 ("x86: Replace ist_enter() with
 nmi_enter()")



On 29.01.21 г. 3:34 ч., Alexei Starovoitov wrote:
> On Thu, Jan 28, 2021 at 07:24:14PM +0100, Peter Zijlstra wrote:
>> On Thu, Jan 28, 2021 at 06:45:56PM +0200, Nikolay Borisov wrote:
>>> it would be placed on the __fentry__ (and not endbr64) hence it works.
>>> So perhaps a workaround outside of bpf could essentially detect this
>>> scenario and adjust the probe to be on the __fentry__ and not preceding
>>> instruction if it's detected to be endbr64 ?
>>
>> Arguably the fentry handler should also set the nmi context, it can,
>> after all, interrupt pretty much any other context by construction.
> 
> But that doesn't make it a real nmi.
> nmi can actually interrupt anything. Whereas kprobe via int3 cannot
> do nokprobe and noinstr sections. The exposure is a lot different.
> ftrace is even more contained. It's only at the start of the functions.
> It's even smaller subset of places than kprobes.
> So ftrace < kprobe < nmi.
> Grouping them all into nmi doesn't make sense to me.
> That bpf breaking change came unnoticed mostly because people use
> kprobes in the beginning of the functions only, but there are cases
> where kprobes are in the middle too. People just didn't upgrade kernels
> fast enough to notice.

nit: slight correction - I observed while I was putting kprobes at the
beginning of the function but __fentry__ wasn't the first thing in the
function's code. The reason why people haven't observed is because
everyone is running with retpolines enabled which disables CFI/CET.

> imo appropriate solution would be to have some distinction between
> ftrace < kprobe_via_int3 < nmi, so that bpf side can react accordingly.
> That code in trace_call_bpf:
>   if (in_nmi()) /* not supported yet */
> was necessary in the past. Now we have all sorts of protections built in.
> So it's probably ok to just drop it, but I think adding
> called_via_ftrace vs called_via_kprobe_int3 vs called_via_nmi
> is more appropriate solution long term.
>