[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20251126093920.30763e89@gandalf.local.home>
Date: Wed, 26 Nov 2025 09:39:20 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Linux Trace Kernel
<linux-trace-kernel@...r.kernel.org>, Mathieu Desnoyers
<mathieu.desnoyers@...icios.com>, Tom Zanussi <zanussi@...nel.org>
Subject: Re: [PATCH v2] tracing: Add system trigger file to enable triggers
for all the system's events
On Wed, 26 Nov 2025 15:02:51 +0900
Masami Hiramatsu (Google) <mhiramat@...nel.org> wrote:
> > This also allows to remove a trigger from all events in a subsystem (even
> > if it's not a subsystem trigger!).
> >
>
> I have some comments below.
BTW, it's more appropriate to simply trim the email ;-)
> > --- a/kernel/trace/trace_events.c
> > +++ b/kernel/trace/trace_events.c
> > @@ -2168,51 +2168,52 @@ event_filter_write(struct file *filp, const char __user *ubuf, size_t cnt,
> >
> > static LIST_HEAD(event_subsystems);
> >
> > -static int subsystem_open(struct inode *inode, struct file *filp)
> > +struct trace_subsystem_dir *trace_get_system_dir(struct inode *inode)
> > {
> > - struct trace_subsystem_dir *dir = NULL, *iter_dir;
> > - struct trace_array *tr = NULL, *iter_tr;
> > - struct event_subsystem *system = NULL;
> > - int ret;
> > + struct trace_subsystem_dir *dir;
> > + struct trace_array *tr = NULL;
>
> nit: This also no need to be initialized.
Hmm, I guess this was needed in one of the versions I had before posting.
I'll fix in v3.
>
> >
> > - if (tracing_is_disabled())
> > - return -ENODEV;
> > + guard(mutex)(&event_mutex);
> > + guard(mutex)(&trace_types_lock);
> >
> > /* Make sure the system still exists */
> > - mutex_lock(&event_mutex);
> > - mutex_lock(&trace_types_lock);
> > - list_for_each_entry(iter_tr, &ftrace_trace_arrays, list) {
> > - list_for_each_entry(iter_dir, &iter_tr->systems, list) {
> > - if (iter_dir == inode->i_private) {
> > + list_for_each_entry(tr, &ftrace_trace_arrays, list) {
> > + list_for_each_entry(dir, &tr->systems, list) {
> > + if (dir == inode->i_private) {
> > /* Don't open systems with no events */
> > - tr = iter_tr;
> > - dir = iter_dir;
> > - if (dir->nr_events) {
> > - __get_system_dir(dir);
> > - system = dir->subsystem;
> > - }
> > - goto exit_loop;
> > + if (!dir->nr_events)
> > + return NULL;
> > + if (__trace_array_get(tr) < 0)
> > + return NULL;
> > + __get_system_dir(dir);
> > + return dir;
> > }
> > }
> > }
> > static ssize_t event_trigger_regex_write(struct file *file,
> > const char __user *ubuf,
> > size_t cnt, loff_t *ppos)
> > {
> > struct trace_event_file *event_file;
> > ssize_t ret;
> > - char *buf __free(kfree) = NULL;
> > + char *buf __free(kfree) = get_user_buf(ubuf, cnt);
> >
> > - if (!cnt)
> > + if (!buf)
> > return 0;
> >
> > - if (cnt >= PAGE_SIZE)
> > - return -EINVAL;
> > -
> > - buf = memdup_user_nul(ubuf, cnt);
> > if (IS_ERR(buf))
> > return PTR_ERR(buf);
>
> You can simply write:
>
> if (IS_ERR_OR_NULL(buf))
> return PTR_ERR_OR_ZERO(buf);
Yes I can. But honestly, the above is much harder to understand what is
happening than the code I had written.
I mean:
if (!buf)
return 0;
if (IS_ERR(buf))
return PTR_ERR(buf);
is pretty obvious of what is happening.
if (IS_ERR_OR_NULL(buf))
return PTR_ERR_OR_ZERO(buf);
Is quite a bit more obfuscated. I mean, I needed to look up what
PTR_ERR_OR_ZERO() did to be sure I knew what was returned.
> > + list_for_each_entry(file, &tr->events, list) {
> > +
> > + if (strcmp(system->name, file->event_call->class->system) != 0)
> > + continue;
> > +
> > + ret = p->parse(p, file, buff, command, next);
> > +
> > + /* Removals and existing events do not error */
> > + if (ret < 0 && ret != -EEXIST && !remove) {
> > + pr_warn("Failed adding trigger %s on %s\n",
> > + command, trace_event_name(file->event_call));
> > + }
>
>
> Can I expect that this can recover the previous settings
> via event trigger?
> e.g.
>
> # echo "stacktrace" > events/sched/sched_wakeup/trigger
> # echo "stacktrace" > events/sched/trigger
> # echo "!stacktrace" > events/sched/trigger
> # cat events/sched/sched_wakeup/trigger
> stacktrace:unlimited
>
> ?
No. In fact, this is one of the features of the system trigger. Writing
into the system/trigger file is the same as writing into each of the
system's event's trigger files one at a time. In fact, I updated the
documentation in this patch to show that this file can be used to clear
tiggers too!
+ echo snapshot > events/sched/sched_waking/trigger
+ cat events/sched/sched_waking/trigger
+ snapshot:unlimited
+ echo '!snapshot' > events/sched/trigger
+ cat events/sched/sched_waking/trigger
+ # Available triggers:
+ # traceon traceoff snapshot stacktrace enable_event disable_event enable_hist disable_hist hist
>
>
> > + }
> > + return 0;
> > +}
> > +
> > +static ssize_t
> > +event_system_trigger_write(struct file *filp, const char __user *ubuf,
> > + size_t cnt, loff_t *ppos)
> > +{
> > + struct trace_subsystem_dir *dir = filp->private_data;
> > + struct event_command *p;
> > + char *command, *next;
> > + char *buf __free(kfree) = get_user_buf(ubuf, cnt);
> > + bool remove = false;
> > + bool found = false;
> > + ssize_t ret;
> > +
> > + if (!buf)
> > + return 0;
> > +
> > + if (IS_ERR(buf))
> > + return PTR_ERR(buf);
>
> Ditto.
And ditto again with my reply ;-)
>
> > +
> > + /* system triggers are not allowed to have counters */
> > + if (strchr(buf, ':'))
> > + return -EINVAL;
>
> ':' is not always used for counters (e.g. hist) and it seems odd
> to check anything about parse here. Can we do this counter check
> after parse a command?
>
> > +
> > + /* If opened for read too, dir is in the seq_file descriptor */
> > + if (filp->f_mode & FMODE_READ) {
> > + struct seq_file *m = filp->private_data;
> > + dir = m->private;
> > + }
> > +
> > + /* Skip added space at beginning of buf */
> > + next = strim(buf);
> > +
> > + command = strsep(&next, " \t");
> > + if (next) {
> > + next = skip_spaces(next);
> > + if (!*next)
> > + next = NULL;
> > + }
>
> strim() removes both leading and trailing whitespace. So this check is
> not required.
But next here is not the one that had strim() attached to it.
command = strsep(&next, " \t");
Updates the content of next.
Thanks for the review,
-- Steve
Powered by blists - more mailing lists