linux-kernel - Re: [PATCH v3] kprobe/ftrace: bail out if ftrace was killed

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87ttj9h3kp.fsf@oracle.com>
Date: Tue, 07 May 2024 11:36:22 -0700
From: Stephen Brennan <stephen.s.brennan@...cle.com>
To: Christophe Leroy <christophe.leroy@...roup.eu>,
        Steven Rostedt
 <rostedt@...dmis.org>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Mark
 Rutland <mark.rutland@....com>
Cc: Guo Ren <guoren@...nel.org>, Huacai Chen <chenhuacai@...nel.org>,
        WANG
 Xuerui <kernel@...0n.name>,
        "James E.J. Bottomley"
 <James.Bottomley@...senPartnership.com>,
        Helge Deller <deller@....de>, Michael Ellerman <mpe@...erman.id.au>,
        Nicholas Piggin
 <npiggin@...il.com>,
        "Aneesh Kumar K.V" <aneesh.kumar@...nel.org>,
        "Naveen
 N. Rao" <naveen.n.rao@...ux.ibm.com>,
        Paul Walmsley
 <paul.walmsley@...ive.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        Albert Ou
 <aou@...s.berkeley.edu>,
        Heiko
 Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
        Alexander
 Gordeev <agordeev@...ux.ibm.com>,
        Christian Borntraeger
 <borntraeger@...ux.ibm.com>,
        Sven Schnelle <svens@...ux.ibm.com>,
        Thomas
 Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
        Borislav
 Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "x86@...nel.org" <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-trace-kernel@...r.kernel.org" <linux-trace-kernel@...r.kernel.org>,
        "linux-csky@...r.kernel.org" <linux-csky@...r.kernel.org>,
        "loongarch@...ts.linux.dev" <loongarch@...ts.linux.dev>,
        "linux-parisc@...r.kernel.org" <linux-parisc@...r.kernel.org>,
        "linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
        "linux-riscv@...ts.infradead.org" <linux-riscv@...ts.infradead.org>,
        "linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>
Subject: Re: [PATCH v3] kprobe/ftrace: bail out if ftrace was killed

Christophe Leroy <christophe.leroy@...roup.eu> writes:
> Le 01/05/2024 à 18:29, Stephen Brennan a écrit :
>> If an error happens in ftrace, ftrace_kill() will prevent disarming
>> kprobes. Eventually, the ftrace_ops associated with the kprobes will be
>> freed, yet the kprobes will still be active, and when triggered, they
>> will use the freed memory, likely resulting in a page fault and panic.
>> 
>> This behavior can be reproduced quite easily, by creating a kprobe and
>> then triggering a ftrace_kill(). For simplicity, we can simulate an
>> ftrace error with a kernel module like [1]:
>> 
>> [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer
>> 
>>    sudo perf probe --add commit_creds
>>    sudo perf trace -e probe:commit_creds
>>    # In another terminal
>>    make
>>    sudo insmod ftrace_killer.ko  # calls ftrace_kill(), simulating bug
>>    # Back to perf terminal
>>    # ctrl-c
>>    sudo perf probe --del commit_creds
>> 
>> After a short period, a page fault and panic would occur as the kprobe
>> continues to execute and uses the freed ftrace_ops. While ftrace_kill()
>> is supposed to be used only in extreme circumstances, it is invoked in
>> FTRACE_WARN_ON() and so there are many places where an unexpected bug
>> could be triggered, yet the system may continue operating, possibly
>> without the administrator noticing. If ftrace_kill() does not panic the
>> system, then we should do everything we can to continue operating,
>> rather than leave a ticking time bomb.
>> 
>> Signed-off-by: Stephen Brennan <stephen.s.brennan@...cle.com>
>> ---
>> Changes in v3:
>>    Don't expose ftrace_is_dead(). Create a "kprobe_ftrace_disabled"
>>    variable and check it directly in the kprobe handlers.
>
> Isn't it safer to provide a fonction rather than a direct access to a 
> variable ?

Is the concern that other code could modify this variable? If so, then I
suppose the function call is safer. But the variable is not exported and
I think built-in code can be trusted not to muck with it. Maybe I'm
missing your point about safety though?

> By the way, wouldn't it be more performant to use a static branch (jump 
> label) ?

I agree with Steven's concern that text modification would unfortunately
not be a good way to handle an error in text modification. Especially, I
believe there could be deadlock risks, as static key enablement requires
taking the text_mutex and the jump_label_mutex. I'd be concerned that
the text_mutex could already be held in some situations where
ftrace_kill() is called. But I'm not certain about that.

Thanks for taking a look!
Stephen