Message-ID: <87ttj9h3kp.fsf@oracle.com>
Date: Tue, 07 May 2024 11:36:22 -0700
From: Stephen Brennan <stephen.s.brennan@...cle.com>
To: Christophe Leroy <christophe.leroy@...roup.eu>,
    Steven Rostedt <rostedt@...dmis.org>,
    Masami Hiramatsu <mhiramat@...nel.org>,
    Mark Rutland <mark.rutland@....com>
Cc: Guo Ren <guoren@...nel.org>,
    Huacai Chen <chenhuacai@...nel.org>,
    WANG Xuerui <kernel@...0n.name>,
    "James E.J. Bottomley" <James.Bottomley@...senPartnership.com>,
    Helge Deller <deller@....de>,
    Michael Ellerman <mpe@...erman.id.au>,
    Nicholas Piggin <npiggin@...il.com>,
    "Aneesh Kumar K.V" <aneesh.kumar@...nel.org>,
    "Naveen N. Rao" <naveen.n.rao@...ux.ibm.com>,
    Paul Walmsley <paul.walmsley@...ive.com>,
    Palmer Dabbelt <palmer@...belt.com>,
    Albert Ou <aou@...s.berkeley.edu>,
    Heiko Carstens <hca@...ux.ibm.com>,
    Vasily Gorbik <gor@...ux.ibm.com>,
    Alexander Gordeev <agordeev@...ux.ibm.com>,
    Christian Borntraeger <borntraeger@...ux.ibm.com>,
    Sven Schnelle <svens@...ux.ibm.com>,
    Thomas Gleixner <tglx@...utronix.de>,
    Ingo Molnar <mingo@...hat.com>,
    Borislav Petkov <bp@...en8.de>,
    Dave Hansen <dave.hansen@...ux.intel.com>,
    "x86@...nel.org" <x86@...nel.org>,
    "H. Peter Anvin" <hpa@...or.com>,
    "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
    "linux-trace-kernel@...r.kernel.org" <linux-trace-kernel@...r.kernel.org>,
    "linux-csky@...r.kernel.org" <linux-csky@...r.kernel.org>,
    "loongarch@...ts.linux.dev" <loongarch@...ts.linux.dev>,
    "linux-parisc@...r.kernel.org" <linux-parisc@...r.kernel.org>,
    "linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
    "linux-riscv@...ts.infradead.org" <linux-riscv@...ts.infradead.org>,
    "linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>
Subject: Re: [PATCH v3] kprobe/ftrace: bail out if ftrace was killed

Christophe Leroy <christophe.leroy@...roup.eu> writes:
> On 01/05/2024 at 18:29, Stephen Brennan wrote:
>> If an error happens in ftrace, ftrace_kill() will prevent disarming
>> kprobes. Eventually, the ftrace_ops associated with the kprobes will be
>> freed, yet the kprobes will still be active, and when triggered, they
>> will use the freed memory, likely resulting in a page fault and panic.
>>
>> This behavior can be reproduced quite easily, by creating a kprobe and
>> then triggering a ftrace_kill(). For simplicity, we can simulate an
>> ftrace error with a kernel module like [1]:
>>
>> [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer
>>
>>     sudo perf probe --add commit_creds
>>     sudo perf trace -e probe:commit_creds
>>     # In another terminal
>>     make
>>     sudo insmod ftrace_killer.ko  # calls ftrace_kill(), simulating bug
>>     # Back to perf terminal
>>     # ctrl-c
>>     sudo perf probe --del commit_creds
>>
>> After a short period, a page fault and panic would occur as the kprobe
>> continues to execute and uses the freed ftrace_ops. While ftrace_kill()
>> is supposed to be used only in extreme circumstances, it is invoked in
>> FTRACE_WARN_ON() and so there are many places where an unexpected bug
>> could be triggered, yet the system may continue operating, possibly
>> without the administrator noticing. If ftrace_kill() does not panic the
>> system, then we should do everything we can to continue operating,
>> rather than leave a ticking time bomb.
>>
>> Signed-off-by: Stephen Brennan <stephen.s.brennan@...cle.com>
>> ---
>> Changes in v3:
>> Don't expose ftrace_is_dead(). Create a "kprobe_ftrace_disabled"
>> variable and check it directly in the kprobe handlers.
>
> Isn't it safer to provide a function rather than direct access to a
> variable?

Is the concern that other code could modify this variable? If so, then I
suppose the function call is safer. But the variable is not exported and
I think built-in code can be trusted not to muck with it. Maybe I'm
missing your point about safety though?
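
To make the two options concrete, here is a minimal userspace sketch of
the pattern under discussion. It is not the patch itself: only the name
kprobe_ftrace_disabled comes from the v3 changelog, while
kprobe_ftrace_is_disabled(), simulate_ftrace_kill() and
demo_kprobe_handler() are hypothetical names made up for illustration.

```c
/* Userspace model only; this is NOT the kernel code from the patch. */
#include <stdbool.h>

/* Option A (v3 changelog): a flag that the handlers read directly. */
bool kprobe_ftrace_disabled;

/*
 * Option B (reviewer's suggestion): hide the flag behind an accessor.
 * The function name is hypothetical.
 */
static inline bool kprobe_ftrace_is_disabled(void)
{
	return kprobe_ftrace_disabled;
}

/* Hypothetical stand-in for the point where ftrace_kill() marks
 * ftrace-based kprobes as unusable. */
void simulate_ftrace_kill(void)
{
	kprobe_ftrace_disabled = true;
}

/*
 * Hypothetical handler. Returns 1 if the probe body ran and 0 if it
 * bailed out because ftrace was killed.
 */
int demo_kprobe_handler(void)
{
	if (kprobe_ftrace_disabled)	/* Option B: kprobe_ftrace_is_disabled() */
		return 0;		/* bail out before touching ftrace_ops */

	return 1;			/* normal probe processing would go here */
}
```

Both options perform the same check in the handler. The accessor only
changes who is able to write the flag, which is the safety question
raised above.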

> By the way, wouldn't it be more performant to use a static branch (jump
> label)?

I agree with Steven's concern that text modification would unfortunately
not be a good way to handle an error in text modification. In particular,
I believe there could be deadlock risks, as static key enablement requires
taking the text_mutex and the jump_label_mutex. I'd be concerned that
the text_mutex could already be held in some situations where
ftrace_kill() is called. But I'm not certain about that.
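
The hazard can be modeled in userspace with a pthread mutex. This is a
sketch under the assumption that an error path runs while text_mutex is
held, which has not been confirmed for ftrace_kill(). The pthread mutex
only stands in for the kernel's text_mutex (jump_label_mutex is left out
for brevity), and every demo_* name is hypothetical. The sketch uses
pthread_mutex_trylock() so that the conflict is reported as EBUSY
instead of hanging the way a blocking lock would.

```c
/* Userspace model only; the mutex merely stands in for text_mutex. */
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t text_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool demo_key_enabled;

/*
 * Hypothetical model of enabling a static key, which needs text_mutex.
 * Returns 0 on success, or EBUSY if the lock is already held. With a
 * blocking lock, the EBUSY case would wait forever instead.
 */
int demo_static_key_enable(void)
{
	int err = pthread_mutex_trylock(&text_mutex);

	if (err)
		return err;

	demo_key_enabled = true;	/* stands in for patching the branch */
	pthread_mutex_unlock(&text_mutex);
	return 0;
}

/*
 * Hypothetical error path that fires while text is being modified,
 * i.e. with text_mutex already held by the caller.
 */
int demo_error_while_patching(void)
{
	int ret;

	pthread_mutex_lock(&text_mutex);
	ret = demo_static_key_enable();	/* conflicts with our own lock */
	pthread_mutex_unlock(&text_mutex);
	return ret;
}
```

A plain flag has no such dependency: setting it takes no locks, so it
can be done from any context in which the error is detected.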

Thanks for taking a look!
Stephen