Message-ID: <e1ccc82d-6939-4a0f-8911-11ae674cfd8a@efficios.com>
Date: Sun, 15 Dec 2024 09:39:24 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, Masami Hiramatsu <mhiramat@...nel.org>,
Mark Rutland <mark.rutland@....com>, Al Viro <viro@...iv.linux.org.uk>,
Michal Simek <monstr@...str.eu>
Subject: Re: [GIT PULL] ftrace: Fixes for v6.13
On 2024-12-15 08:47, Steven Rostedt wrote:
> On Sun, 15 Dec 2024 07:42:35 -0500
> Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
>
>> On 2024-12-15 05:05, Steven Rostedt wrote:
>>> On Sat, 14 Dec 2024 21:19:01 -0800
>>> Linus Torvalds <torvalds@...ux-foundation.org> wrote:
>>
>> [...]
>>
>>>>
>>>> Just disable it unconditionally.
>>>>
>>>
>>> I can do that, but I'm not looking forward to seeing random crashes in the
>>> trace event code again :-(
>>>
>>> Honestly, I did not like this code when I wrote it, but I have no idea how
>>> to stop the "%s" bug from happening before it gets out to production. This
>>> worked. Do you have any suggestions for alternatives?
>>
>> IMHO, deferred execution of TP_printk() code in kernel context is
>> a fundamental mistake causing all those problems. This opens the
>> door to store pointers to strings (or anything else really)
>> that sit in kernel modules which can be unloaded between
>
> Module unloading will clear out the ring buffers to prevent issues.

As a side effect, issues caused by module unloading won't be
observable with tracing.

>
>> tracing and TP_printk() execution, or as we are seeing here
>> pointers to data which can be mapped at different addresses
>> across kernel reboot, into the ring buffer.
>>
>> If TP_printk() did not have access to load data from random kernel
>> memory in the first place, and can only read from the buffer, we
>> would not be having those misuses, and there would be nothing to
>> work-around as the strings/data would all be serialized into the
>> ring buffer.
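
To make the pointer-versus-serialized distinction above concrete, here is a minimal userspace sketch in C. It is not kernel code and does not use the ftrace ring buffer API; `struct event_by_pointer`, `struct event_serialized`, `record_serialized()`, `print_serialized()` and `MSG_MAX` are hypothetical names chosen for illustration. The first layout stores only an address that must still be valid at print time; the second copies the string bytes into the record, so the printing side needs nothing but the record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MSG_MAX 32  /* hypothetical fixed payload size for this sketch */

/*
 * Fragile layout: the record holds only an address. Printing later
 * dereferences memory that may have been freed, unloaded or remapped
 * since the event was recorded.
 */
struct event_by_pointer {
    const char *msg;
};

/* Self-contained layout: the string bytes live inside the record. */
struct event_serialized {
    char msg[MSG_MAX];
};

/* At trace time, copy the string into the record (truncating if needed). */
static void record_serialized(struct event_serialized *ev, const char *src)
{
    assert(ev != NULL && src != NULL);
    snprintf(ev->msg, sizeof(ev->msg), "%s", src);
}

/* At print time, format using only the bytes stored in the record. */
static int print_serialized(const struct event_serialized *ev,
                            char *out, size_t len)
{
    return snprintf(out, len, "msg=%s", ev->msg);
}
```

With the serialized layout, the source string can be overwritten or go away after `record_serialized()` returns and `print_serialized()` still produces the original text. With `struct event_by_pointer`, the same sequence would read whatever happens to be at the saved address.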
>>
>> In LTTng we've taken the approach to only read the trace data
>> at post-processing from user-space (we don't have the equivalent
>> of TP_printk(), and that's on purpose).
>>
>> I wonder if we could keep the ftrace trace_pipe pretty-printing
>> behavior, while isolating the TP_printk() execution into a
>> userspace process which would only map the ring buffer ? This way,
>
> That would change the entire use of tracefs, especially in the embedded
> world. Note, this hasn't been a major issue since the test/check logic was
> put in place. It catches pretty much all issues with the delayed printing.

This is not at all what I have in mind, so let me rephrase.

What I am saying is: is there a way we could execute TP_printk()
in userspace mode _while preserving the trace_pipe tracefs ABI_?

I suspect that inserting this small userspace program into the
kernel image with objcopy would be a start. Then adapting the
usermode helper code to run a program from a preexisting
in-kernel copy could be a second step. Then modifying trace_pipe
so it blocks and communicates with this helper program to
consume the formatted output would come last.
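
As a rough userspace analogy of the last step only (a reader that blocks while a separate process does the formatting), here is a sketch in C using fork() and pipes. It is not the kernel usermode helper interface, and a pipe stands in for the mapped ring buffer; `struct trace_record`, `helper_main()` and `format_via_helper()` are hypothetical names. The helper sees only the serialized record, so a formatting bug makes the helper fail rather than the reader.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical serialized event: all data the formatter needs is inline. */
struct trace_record {
    uint32_t cpu;
    char comm[16];
};

/* Helper side: read one record from in_fd, write formatted text to out_fd. */
static void helper_main(int in_fd, int out_fd)
{
    struct trace_record rec;
    char line[64];
    int n;

    if (read(in_fd, &rec, sizeof(rec)) != (ssize_t)sizeof(rec))
        _exit(1);
    rec.comm[sizeof(rec.comm) - 1] = '\0';  /* do not trust the payload */
    n = snprintf(line, sizeof(line), "cpu=%u comm=%s\n",
                 (unsigned)rec.cpu, rec.comm);
    if (n < 0 || write(out_fd, line, (size_t)n) != (ssize_t)n)
        _exit(1);
    _exit(0);
}

/* Reader side: returns 0 and fills out on success, -1 if the helper failed. */
static int format_via_helper(const struct trace_record *rec,
                             char *out, size_t len)
{
    int to_helper[2], from_helper[2], status = 0;
    ssize_t n;
    pid_t pid;

    assert(rec != NULL && out != NULL && len > 0);
    if (pipe(to_helper) < 0)
        return -1;
    if (pipe(from_helper) < 0) {
        close(to_helper[0]);
        close(to_helper[1]);
        return -1;
    }
    pid = fork();
    if (pid < 0) {
        close(to_helper[0]);
        close(to_helper[1]);
        close(from_helper[0]);
        close(from_helper[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: keep only the ends the helper uses; never returns. */
        close(to_helper[1]);
        close(from_helper[0]);
        helper_main(to_helper[0], from_helper[1]);
    }
    close(to_helper[0]);
    close(from_helper[1]);

    n = write(to_helper[1], rec, sizeof(*rec));
    close(to_helper[1]);
    if (n == (ssize_t)sizeof(*rec))
        n = read(from_helper[0], out, len - 1);  /* blocks until output */
    else
        n = -1;
    close(from_helper[0]);

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (n <= 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;  /* a real implementation could log a warning here */
    out[n] = '\0';
    return 0;
}
```

A caller would fill a `struct trace_record`, call `format_via_helper()`, and treat a -1 return as "the helper could not print this event", which is the point where the dmesg warning suggested in this thread would fit.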
Thanks,
Mathieu
>
> -- Steve
>
>
>> users trying to misuse TP_printk() would get immediate feedback
>> about their mistake because they cannot print the trace. We could
>> print a dmesg warning about crash of a usermode helper program,
>> for instance.
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com