linux-kernel - Re: [PATCH] trace: Add trace points for tasklet entry/exit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20230104232918.2791484c39ac54c59a7110f4@kernel.org>
Date:   Wed, 4 Jan 2023 23:29:18 +0900
From:   Masami Hiramatsu (Google) <mhiramat@...nel.org>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     John Stultz <jstultz@...gle.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Lingutla Chandrasekhar <clingutla@...eaurora.org>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        "Connor O'Brien" <connoro@...gle.com>, kernel-team@...roid.com,
        "J . Avila" <elavila@...gle.com>
Subject: Re: [PATCH] trace: Add trace points for tasklet entry/exit

On Tue, 3 Jan 2023 15:15:54 -0500
Steven Rostedt <rostedt@...dmis.org> wrote:

> On Tue,  3 Jan 2023 18:54:08 +0000
> John Stultz <jstultz@...gle.com> wrote:
> 
> > From: Lingutla Chandrasekhar <clingutla@...eaurora.org>
> > 
> > Tasklets are supposed to finish their work quickly and
> > should not block the current running process, but it is not
> > guaranteed that. Currently softirq_entry/exit can be used to
> > know total tasklets execution time, but not helpful to track
> > individual tasklet's execution time. With that we can't find
> > any culprit tasklet function, which is taking more time.
> > 
> > Add tasklet_entry/exit trace point support to track
> > individual tasklet execution.
> > 
> > This patch has been carried in the Android tree for awhile
> > so I wanted to submit it for review upstream. Feedback would
> > be appreciated!
> > 
> > Cc: Steven Rostedt <rostedt@...dmis.org>
> > Cc: Masami Hiramatsu <mhiramat@...nel.org>
> > Cc: Thomas Gleixner <tglx@...utronix.de>
> > Cc: "Paul E. McKenney" <paulmck@...nel.org>
> > Cc: Connor O'Brien <connoro@...gle.com>
> > Cc: kernel-team@...roid.com
> > Reviewed-by: Steven Rostedt (Google) <rostedt@...dmis.org>
> > Signed-off-by: Lingutla Chandrasekhar <clingutla@...eaurora.org>
> > [elavila: Port to android-mainline]
> > Signed-off-by: J. Avila <elavila@...gle.com>
> > [jstultz: Rebased to upstream, cut unused trace points, added
> >  comments for the tracepoints, reworded commit]
> > Signed-off-by: John Stultz <jstultz@...gle.com>
> > ---
> >  include/trace/events/irq.h | 43 ++++++++++++++++++++++++++++++++++++++
> >  kernel/softirq.c           |  9 ++++++--
> >  2 files changed, 50 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
> > index eeceafaaea4c..da85851d4ec1 100644
> > --- a/include/trace/events/irq.h
> > +++ b/include/trace/events/irq.h
> > @@ -160,6 +160,49 @@ DEFINE_EVENT(softirq, softirq_raise,
> >  	TP_ARGS(vec_nr)
> >  );
> >  
> > +DECLARE_EVENT_CLASS(tasklet,
> > +
> > +	TP_PROTO(void *func),
> > +
> > +	TP_ARGS(func),
> > +
> > +	TP_STRUCT__entry(
> 
> Could you also add a pointer to the tasklet too?
> 
> 		__field(	void *, tasklet)
> 
> > +		__field(	void *,	func)
> > +	),
> > +
> > +	TP_fast_assign(
> 
> 		__entry->tasklet = t;
> 
> > +		__entry->func = func;
> > +	),
> > +
> > +	TP_printk("function=%ps", __entry->func)
> 
> This way if we wanted more information, we could use event probes:
> 
>  # echo 'e:tasklet_info tasklet/tasklet_entry state=+8($tasklet):u64' > dynamic_events

Hmm, what about saving 'state' and 'count' instead of 'tasklet'?

I have a question about the basic policy of making a new tracepoint.

Of course we can expand the event with eprobes as you said, but without
eprobe, this 'tasklet' field of this event just exposing a kernel
internal object address. That is useless in most cases. And also the
offset (layout) in the kernel data structure can be changed by some
debug options. We need an external tool to find correct offset (e.g.
perf probe).

So my question is when adding a new event, whether it should expose a
(address of) related data structure, or expose some value fields of
the structure. IMHO, the basic policy is latter. Of course if the
data structure is enough big and most of its fields are usually not
interesting, it may be better to save the data structure itself.

Thank you,

> 
> -- Steve
> 
> 
> > +);
> > +
> > +/**
> > + * tasklet_entry - called immediately before the tasklet is run
> > + * @func:  tasklet callback or function being run
> > + *
> > + * Used to find individual tasklet execution time
> > + */
> > +DEFINE_EVENT(tasklet, tasklet_entry,
> > +
> > +	TP_PROTO(void *func),
> > +
> > +	TP_ARGS(func)
> > +);
> > +
> > +/**
> > + * tasklet_exit - called immediately after the tasklet is run
> > + * @func:  tasklet callback or function being run
> > + *
> > + * Used to find individual tasklet execution time
> > + */
> > +DEFINE_EVENT(tasklet, tasklet_exit,
> > +
> > +	TP_PROTO(void *func),
> > +
> > +	TP_ARGS(func)
> > +);
> > +
> >  #endif /*  _TRACE_IRQ_H */
> >  
> >  /* This part must be outside protection */
> > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > index c8a6913c067d..dbd322524171 100644
> > --- a/kernel/softirq.c
> > +++ b/kernel/softirq.c
> > @@ -793,10 +793,15 @@ static void tasklet_action_common(struct softirq_action *a,
> >  		if (tasklet_trylock(t)) {
> >  			if (!atomic_read(&t->count)) {
> >  				if (tasklet_clear_sched(t)) {
> > -					if (t->use_callback)
> > +					if (t->use_callback) {
> > +						trace_tasklet_entry(t->callback);
> >  						t->callback(t);
> > -					else
> > +						trace_tasklet_exit(t->callback);
> > +					} else {
> > +						trace_tasklet_entry(t->func);
> >  						t->func(t->data);
> > +						trace_tasklet_exit(t->func);
> > +					}
> >  				}
> >  				tasklet_unlock(t);
> >  				continue;
> 


-- 
Masami Hiramatsu (Google) <mhiramat@...nel.org>