Message-ID: <dc36b163-5626-4d39-bd8f-35dc353bef17@efficios.com>
Date: Thu, 7 Nov 2024 11:36:00 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Marco Elver <elver@...gle.com>, Kees Cook <keescook@...omium.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>, Oleg Nesterov <oleg@...hat.com>,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
Dmitry Vyukov <dvyukov@...gle.com>, kasan-dev@...glegroups.com
Subject: Re: [PATCH v2 1/2] tracing: Add task_prctl_unknown tracepoint

On 2024-11-07 11:04, Steven Rostedt wrote:
> On Thu, 7 Nov 2024 10:52:37 -0500
> Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
>
>> I suspect you base the overhead analysis on the x86-64 implementation
>> of sys_enter/exit tracepoint and especially the overhead caused by
>> the SYSCALL_WORK_SYSCALL_TRACEPOINT thread flag, am I correct ?
>>
>> If that is causing a too large overhead, we should investigate if
>> those can be improved instead of adding tracepoints in the
>> implementation of system calls.
>
> That would be great to get better, but the reason I'm not against this
> patch is because prctl() is not a normal system call. It's basically an
> ioctl() for Linux, and very vague. It's basically the garbage system call
> when you don't know what to do. It's even being proposed for the sframe
> work.
>
> I understand your sentiment and agree. I don't want any random system call
> to get a tracepoint attached to it. But here I'd make an exception.

Should we document this as an "instrumentation good practice" then?

When the system call is a multiplexor such as ioctl(2) or prctl(2),
instrumenting it with tracepoints within each of the "op" cases
makes sense for overall maintainability.

For non-multiplexor system calls, using the existing sys_enter/exit
tracepoints should be favored.
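
To make the multiplexor case concrete, here is a minimal user-space
sketch in C, not kernel code. The op codes, toy_multiplexor() and the
counters standing in for tracepoints are all hypothetical; in the kernel
each case would call a tracepoint defined with TRACE_EVENT() rather than
bump a counter.

```c
#include <errno.h>

/* Hypothetical op codes for a toy multiplexor; not real prctl() options. */
enum toy_op {
	TOY_OP_SET = 1,
	TOY_OP_GET = 2,
};

/* Counters standing in for per-op tracepoints (assumption for this sketch). */
static unsigned long trace_hits_set;
static unsigned long trace_hits_get;
static unsigned long trace_hits_unknown;

/* State manipulated by the toy multiplexor. */
static long toy_value;

/* One entry point, many operations: each "op" case carries its own probe. */
static long toy_multiplexor(int op, long *arg)
{
	switch (op) {
	case TOY_OP_SET:
		trace_hits_set++;	/* probe specific to the SET op */
		toy_value = *arg;
		return 0;
	case TOY_OP_GET:
		trace_hits_get++;	/* probe specific to the GET op */
		*arg = toy_value;
		return 0;
	default:
		trace_hits_unknown++;	/* probe for unrecognized ops */
		return -EINVAL;
	}
}
```

The default case plays the role that an "unknown op" tracepoint would
play in a real multiplexor: it is only hit when no known op matches, so
it costs nothing on the known paths.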
This opens the following question for non-multiplexor system calls:
considering that the overhead of the current sys_enter/exit
instrumentation is deemed too large to use in production, perhaps
we should consider a few alternatives, namely:

A) Modify SYSCALL_DEFINE so it emits a function wrapper with tracepoints
   for each system call enter/exit, except for multiplexors, or

B) Add the plumbing required to allow system call tracing to be
   activated for specific system calls only, more fine-grained than
   the current system-wide for_each_process_thread()
   SYSCALL_WORK_SYSCALL_TRACEPOINT thread flag big hammer.
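
Option B could look roughly like the following, again as a user-space
sketch under stated assumptions rather than kernel code: a bitmap indexed
by system call number is consulted at syscall entry, so only the selected
system calls pay for the tracepoint. TOY_NR_SYSCALLS and every toy_* name
are made up for illustration, and the sketch says nothing about how the
thread flag itself would be set or cleared.

```c
#include <limits.h>
#include <stdbool.h>

#define TOY_NR_SYSCALLS	512	/* hypothetical upper bound on syscall numbers */
#define TOY_WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define TOY_NR_WORDS	((TOY_NR_SYSCALLS + TOY_WORD_BITS - 1) / TOY_WORD_BITS)

/* One bit per system call number: set means "trace this one". */
static unsigned long toy_enabled[TOY_NR_WORDS];

/* Counter standing in for the tracepoint firing (assumption). */
static unsigned long toy_events;

static bool toy_valid_nr(long nr)
{
	return nr >= 0 && nr < TOY_NR_SYSCALLS;
}

static void toy_trace_enable(long nr)
{
	unsigned long n = (unsigned long)nr;

	if (toy_valid_nr(nr))
		toy_enabled[n / TOY_WORD_BITS] |= 1UL << (n % TOY_WORD_BITS);
}

static void toy_trace_disable(long nr)
{
	unsigned long n = (unsigned long)nr;

	if (toy_valid_nr(nr))
		toy_enabled[n / TOY_WORD_BITS] &= ~(1UL << (n % TOY_WORD_BITS));
}

static bool toy_trace_enabled(long nr)
{
	unsigned long n = (unsigned long)nr;

	return toy_valid_nr(nr) &&
	       (toy_enabled[n / TOY_WORD_BITS] >> (n % TOY_WORD_BITS)) & 1UL;
}

/* Entry hook: only system calls selected in the bitmap emit an event. */
static void toy_syscall_enter(long nr)
{
	if (toy_trace_enabled(nr))
		toy_events++;
}
```

With this shape, enabling tracing for one system call number leaves every
other system call with a single bitmap test on entry instead of the full
tracepoint path.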
Another scenario to consider is system calls that have iovec arguments.
Should we add a tracepoint within the iovec iteration, or should the
instrumentation target the entire iovec as input/output at system call
enter/exit?
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com