[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aLiHQNBPOytj_85Q@gmail.com>
Date: Thu, 4 Sep 2025 03:21:52 +0900
From: Ryan Chung <seokwoo.chung130@...il.com>
To: Masami Hiramatsu <mhiramat@...nel.org>
Cc: Steven Rostedt <rostedt@...dmis.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
linux-trace-kernel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] trace_fprobe.c: TODO: handle filter, nofilter or
symbol list
On Tue, Sep 02, 2025 at 11:21:47AM +0900, Masami Hiramatsu wrote:
Hi Masami,
Thank you for your comments.
> Hi Ryan,
>
> Thanks for update.
>
> On Fri, 29 Aug 2025 19:20:50 +0900
> Ryan Chung <seokwoo.chung130@...il.com> wrote:
>
> > This v2 addresses the TODO in trace_fprobe to handle comma-separated
> > symbol lists and the '!' prefix. Tokens starting with '!' are collected
> > as "nofilter", and the others as "filter", then passed to
> > register_fprobe() accordingly. Empty tokens are rejected and errors are
> > reported with trace_probe_log_err().
>
> OK, can you describe how it changes the syntax. You may find more things
> to write it down.
>
> For example, fprobe is not only a function entry, but also it supports
> function return. How it is specified? Current "%return" suffix is introduced
> for single symbol (function), like schedule%return. If we introduce a list
> of symbols and filters, it looks more complicated.
>
I see your concern and where my confusion came from.
> For example, "!funcAB,funcC,funcA*%return" seems like the exit of funcA*,
> the entry of funcC, but not covers funcAB. It is naturally misleading
> users. We have to check "funcA*%return,!funcAB,funcC" pattern.
>
> Thus, I think we should use another suffix, like ":exit" (I think the colon
> does strongly separate than comma, what do you think?), or just
> prohibit to use "%return" but user needs to specify "$retval" in fetcharg
> to specify it is the fprobe on function exit. (this maybe more natural)
>
I agree with you here. Using another suffix will make it more clearer
for the user.
So in that sense, I am guessing you are suggesting:
:exit (return)
:entry (explicit entry)
So this applies to the entire list. If a comma or wildcard is present
and %return appears, the parser will reject it with -EINVAL and a
precise trace_probe_log() index. For a single symbol, we use the legacy
foo%return.
Ex)
funcA*,!funcAB,funcC # entry (default)
funcA*,!funcAB,funcC:exit # exit (spec-level)
schedule%return # single-symbol legacy
> The reason why I talked about how to specify the exit point of a function
> so far, is because the variables that can be accessed at the entrance
> and exit are different.
>
Ah I see.
> Also, fprobe supports event-name autogeneration from the symbol name,
> e.g. 'symbol__entry' or 'symbol__exit'. Recently I found the symbol
> already supports wildcards, so sanitize it from the event name [1]
>
> [1] https://lore.kernel.org/all/175535345114.282990.12294108192847938710.stgit@devnote2/
>
> To use this list-style filters, we may need to reject if there is no
> event name. Of cause we can generate event-name from the symbol list
> but you need to sanitize non alphabet-number characters.
>
> Ah, here is another one, symbol is used for ctx->funcname, and this is
> used for querying BTF. But obviously BTF context is unusable for this
> case. So we should set the ctx->funcname = NULL with listed filter.
>
So it seems like this TODO is actually a bit larger scope than the
patch anticipated.
In that sense, maybe we could:
- for Single, literaly symbol, keep autogen symbol__entry/symbol__exit
and sanitize wildcards.
- for List or wildcard, require an explicit [GROUP/]EVENT; reject if
ommitted. No autogen.
I don't completely understand ctx->funcname you mentioned for the
usecase for querying BTF. I will research it more.
> >
> > Questions for maintainers (to confirm my understanding):
> > * Parsing location: Masami suggested doing the parsing in the parse
> > stage (e.g., via parse_symbol_and_return()). v2 keeps the logic in
> > __register_trace_fprobe(), but I can move the call into the parsing
> > path in v3 if that is the preferred place. Is that correct?
>
> Most of above processes have been done in parse_symbol_and_return(),
> thus the parsing it should be done around there.
>
Thank you.
> > * Documentation: I plan to update the user-facing docs for fprobe
> > syntax. Is Documentation/trace/ the right place (e.g.,
> > Documentation/trace/fprobetrace.rst)?
>
> Yes, please explain it with examples.
>
> Also, can you add a testcase (in a sparate patch) for this syntax?)
>
Yes. I will add selftests under tools/testing/selftests/ftrace/:
Accept when list entry/exit; ! exclusions; whitespace variants.
Reject when empty tokens, leading/trailing commas, %return with lists or
wildcards, mixed forms.
Naming: autogen only for single literal; list/wildcard requires explicit
event name.
BTF: ctx->fullname == NULL when multi/wildcard.
I will add the following:
# entry across a list
echo 'f:net/tx_entry tcp_sendmsg*,!tcp_sendmsg_locked,tcp_sendpage' \
> /sys/kernel/tracing/dynamic_events
# exit across the same list (spec-level)
echo 'f:net/tx_exit tcp_sendmsg*,!tcp_sendmsg_locked,tcp_sendpage:exit' \
> /sys/kernel/tracing/dynamic_events
echo 1 > /sys/kernel/tracing/events/net/tx_entry/enable
echo 1 > /sys/kernel/tracing/events/net/tx_exit/enable
> Thank you,
>
> >
> > Link: https://lore.kernel.org/linux-trace-kernel/20250812162101.5981-1-seokwoo.chung130@gmail.com/
> > Signed-off-by: Ryan Chung <seokwoo.chung130@...il.com>
>
> > ---
> >
> > Changes in v2:
> > * Classify '!' tokens as nofilter, others as filter; pass both to
> > register_fprobe().
> > * Reject empty tokens; log errors with trace_probe_log_*().
> > * Use __free(kfree) for temporary buffers.
> > * Keep subject and style per "Submitting patches" (tabs, wrapping).
> > * No manual dedup (leave to ftrace).
> >
> > kernel/trace/trace_fprobe.c | 48 +++++++++++++++++++++++++++++++++++--
> > 1 file changed, 46 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
> > index b36ade43d4b3..d731d9754a39 100644
> > --- a/kernel/trace/trace_fprobe.c
> > +++ b/kernel/trace/trace_fprobe.c
> > @@ -815,6 +815,11 @@ static int trace_fprobe_verify_target(struct trace_fprobe *tf)
> > static int __register_trace_fprobe(struct trace_fprobe *tf)
>
> This is not a good place to add anymore, because this change turned out
> not to meet the expected prerequisites. (when I commented TODO here,
> I didn't expected too.)
>
> Anyway, this is a good opportunity to review this TODO deeper.
>
> Thank you,
>
I see. My question is whether or not all the symbol and filter should be
done in a separate location or possibly separate function (i.e.,
parse_symbol_and_return()).
Unless you prefer dropping %return entirely now, I’ll keep it for
single-symbol compatibility and mark it legacy in
Documentation/trace/fprobetrace.rst. I’ll send v3 with the parser
move, the spec-level suffix, explicit-name rule for list/wildcard,
BTF guard, docs, and selftests.
> > {
> > int i, ret;
> > + const char *p, *q;
> > + size_t spec_len, flen = 0, nflen = 0, tlen;
> > + bool have_f = false, have_nf = false;
> > + char *filter __free(kfree) = NULL;
> > + char *nofilter __free(kfree) = NULL;
> >
> > /* Should we need new LOCKDOWN flag for fprobe? */
> > ret = security_locked_down(LOCKDOWN_KPROBES);
> > @@ -835,8 +840,47 @@ static int __register_trace_fprobe(struct trace_fprobe *tf)
> > if (trace_fprobe_is_tracepoint(tf))
> > return __regsiter_tracepoint_fprobe(tf);
> >
> > - /* TODO: handle filter, nofilter or symbol list */
> > - return register_fprobe(&tf->fp, tf->symbol, NULL);
> > + spec_len = strlen(tf->symbol);
> > + filter = kzalloc(spec_len + 1, GFP_KERNEL);
> > + nofilter = kzalloc(spec_len + 1, GFP_KERNEL);
> > + if (!filter || !nofilter)
> > + return -ENOMEM;
> > +
> > + p = tf->symbol;
> > + for (p = tf->symbol; p; p = q ? q + 1 : NULL) {
> > + q = strchr(p, ',');
> > + tlen = q ? (size_t)(q-p) : strlen(p);
> > +
> > + /* reject empty token */
> > + if (!tlen) {
> > + trace_probe_log_set_index(1);
> > + trace_probe_log_err(0, BAD_TP_NAME);
> > + return -EINVAL;
> > + }
> > +
> > + if (*p == '!') {
> > + if (tlen == 1) {
> > + trace_probe_log_set_index(1);
> > + trace_probe_log_err(0, BAD_TP_NAME);
> > + return -EINVAL;
> > + }
> > + if (have_nf)
> > + nofilter[nflen++] = ',';
> > + memcpy(nofilter + nflen, p + 1, tlen - 1);
> > + nflen += tlen - 1;
> > + have_nf = true;
> > + } else {
> > + if (have_f)
> > + filter[flen++] = ',';
> > + memcpy(filter + flen, p, tlen);
> > + flen += tlen;
> > + have_f = true;
> > + }
> > + }
> > +
> > + return register_fprobe(&tf->fp,
> > + have_f ? filter : NULL,
> > + have_nf ? nofilter : NULL);
> > }
> >
> > /* Internal unregister function - just handle fprobe and flags */
> > --
> > 2.43.0
> >
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@...nel.org>
Best regards,
Ryan Chung
Powered by blists - more mailing lists