Message-ID: <20250815135212.GA1386988@noisy.programming.kicks-ass.net>
Date: Fri, 15 Aug 2025 15:52:12 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Nam Cao <namcao@...utronix.de>
Cc: Steven Rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Gabriele Monaco <gmonaco@...hat.com>,
linux-trace-kernel@...r.kernel.org, linux-kernel@...r.kernel.org,
Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>,
K Prateek Nayak <kprateek.nayak@....com>
Subject: Re: [PATCH v2 4/5] sched: Add task enqueue/dequeue trace points
On Fri, Aug 15, 2025 at 03:40:17PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 06, 2025 at 10:01:20AM +0200, Nam Cao wrote:
>
> > +/*
> > + * The two trace points below may not work as expected for fair tasks due
> > + * to delayed dequeue. See:
> > + * https://lore.kernel.org/lkml/179674c6-f82a-4718-ace2-67b5e672fdee@amd.com/
> > + */
>
> > +DECLARE_TRACE(dequeue_task,
> > + TP_PROTO(int cpu, struct task_struct *task),
> > + TP_ARGS(cpu, task));
> > +
>
> > @@ -2119,7 +2121,11 @@ inline bool dequeue_task(struct rq *rq, struct task_struct *p, int flags)
> > * and mark the task ->sched_delayed.
> > */
> > uclamp_rq_dec(rq, p);
> > - return p->sched_class->dequeue_task(rq, p, flags);
> > + if (p->sched_class->dequeue_task(rq, p, flags)) {
> > + trace_dequeue_task_tp(rq->cpu, p);
> > + return true;
> > + }
> > + return false;
> > }
>
> Hurmpff.. that's not very nice.
>
> How about something like:
>
> dequeue_task():
> 	...
> 	ret = p->sched_class->dequeue_task(rq, p, flags);
> 	if (trace_dequeue_task_tp_enabled() && !(flags & DEQUEUE_SLEEP))
> 		__trace_dequeue_task_tp(rq->cpu, p);
> 	return ret;
>
>
> __block_task():
> 	trace_dequeue_task_tp(rq->cpu, p);
> 	...
>
>
> Specifically, only DEQUEUE_SLEEP is allowed to fail, and DEQUEUE_SLEEP
> will eventually cause __block_task() to be called, either directly, or
> delayed.
If you extend the tracepoint with the sleep state, you can probably
remove the nr_running tracepoints. Esp. once we get this new throttle
stuff sorted.
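
To make the split above concrete, here is a rough user-space sketch of the
idea, not kernel code. Everything in it is a stand-in: the struct, the flag
value, the `delay` parameter and the helper names are assumptions made up for
illustration, and the mock class hook calling block_task() directly is a
simplification of the real call chain. It only models the rule that non-sleep
dequeues are traced in dequeue_task(), while sleep dequeues are traced when
the task is finally blocked, whether immediately or after a delay.

```c
#include <stdbool.h>
#include <stdio.h>

#define DEQUEUE_SLEEP 0x01	/* stand-in value, not the kernel's definition */

/* Hypothetical, stripped-down task; not struct task_struct. */
struct task {
	int pid;
	bool on_rq;
	bool sched_delayed;
};

static int trace_count;	/* how many times the mock tracepoint fired */

/* Mock tracepoint: just counts and prints. */
static void trace_dequeue_task(int cpu, const struct task *p)
{
	trace_count++;
	printf("dequeue_task: cpu=%d pid=%d\n", cpu, p->pid);
}

/* Final removal of a sleeping task; the sleep-path trace lives here. */
static void block_task(int cpu, struct task *p)
{
	trace_dequeue_task(cpu, p);
	p->sched_delayed = false;
	p->on_rq = false;
}

/* Mock class hook: only a sleep dequeue may fail (be delayed). */
static bool class_dequeue(int cpu, struct task *p, int flags, bool delay)
{
	if ((flags & DEQUEUE_SLEEP) && delay) {
		p->sched_delayed = true;	/* stays queued for now */
		return false;
	}
	if (flags & DEQUEUE_SLEEP)
		block_task(cpu, p);		/* blocked right away */
	else
		p->on_rq = false;
	return true;
}

/* Core dequeue: trace here only for the paths that cannot fail. */
static bool dequeue_task(int cpu, struct task *p, int flags, bool delay)
{
	bool ret = class_dequeue(cpu, p, flags, delay);

	if (!(flags & DEQUEUE_SLEEP))
		trace_dequeue_task(cpu, p);
	return ret;
}

/* Runs three scenarios; returns the total number of trace events. */
static int run_demo(void)
{
	struct task a = { 1, true, false };
	struct task b = { 2, true, false };
	struct task c = { 3, true, false };

	trace_count = 0;
	dequeue_task(0, &a, 0, false);			/* non-sleep dequeue */
	dequeue_task(0, &b, DEQUEUE_SLEEP, false);	/* sleep, immediate */
	dequeue_task(0, &c, DEQUEUE_SLEEP, true);	/* sleep, delayed */
	block_task(0, &c);				/* delayed one completes */
	return trace_count;
}
```

In this model every task produces exactly one dequeue event, and the delayed
task produces it only once it actually leaves the queue, which is the
property the tracepoint placement is after.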