Message-ID: <aMgana01-nsNq-XB@jlelli-thinkpadt14gen4.remote.csb>
Date: Mon, 15 Sep 2025 15:54:37 +0200
From: Juri Lelli <juri.lelli@...hat.com>
To: Marcel Ziswiler <marcel.ziswiler@...ethink.co.uk>
Cc: linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Vineeth Pillai <vineeth@...byteword.org>,
Luca Abeni <luca.abeni@...tannapisa.it>
Subject: Re: SCHED_DEADLINE tasks causing WARNING at kernel/sched/sched.h
message
On 12/09/25 12:03, Marcel Ziswiler wrote:
> Hi Juri
>
> Thanks for getting back to me and sorry for my late reply.
>
> On Thu, 2025-09-04 at 11:22 +0200, Juri Lelli wrote:
> > Hi Marcel,
> >
> > On 02/09/25 18:49, Marcel Ziswiler wrote:
> > > Hi
> > >
> > > On Tue, 2025-09-02 at 16:06 +0200, Marcel Ziswiler wrote:
> > > > As part of our trustable work [1], we also run a lot of real time scheduler (SCHED_DEADLINE) tests on the
> > > > mainline Linux kernel (v6.16.2 in below reported case).
> > >
> > > > Looking through more logs from earlier test runs I found similar WARN_ONs dating back as early as v6.15.3.
> > > > So it does not look like a "new" issue in that sense.
> > >
> > > [snip]
> > >
> > > Any help is much appreciated. Thanks!
> >
> > > What's the actual workload composition leading to the warning? I noticed
> > > stress-ng in the report. Could you please share more details?
>
> Yes, sure. It's actually the exact same workload as related to the regression I reported back in April [1].

Ah, OK. So it's the workload I reproduced (with rt-app) while working on
fixing that issue, and I didn't hit the WARN.

I am thinking that we might try to get more info about what's going on
by adding some trace_printks for balance callbacks and then stopping
tracing in case a WARN is hit.

Could you please apply the following, start tracing for sched_switch,
sched_migrate_task and sched_wakeup, and add traceoff_on_warning to the
kernel cmdline? Then share what was collected.
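
For reference, a minimal sketch of the tracing setup described above,
assuming tracefs is mounted at /sys/kernel/tracing and the script runs as
root. The helper names (enable_sched_tracing, save_trace) are made up for
illustration. traceoff_on_warning still has to be given on the kernel
command line (or set through the kernel.traceoff_on_warning sysctl); the
sketch does not handle that part.

```python
from pathlib import Path

# The three sched events named in the request above.
SCHED_EVENTS = ("sched_switch", "sched_migrate_task", "sched_wakeup")


def enable_sched_tracing(tracefs="/sys/kernel/tracing"):
    """Enable the sched events and turn the ring buffer on.

    Assumption: tracefs is mounted at `tracefs` and the caller is root.
    """
    root = Path(tracefs)
    for event in SCHED_EVENTS:
        # Equivalent to: echo 1 > events/sched/<event>/enable
        (root / "events" / "sched" / event / "enable").write_text("1")
    # Equivalent to: echo 1 > tracing_on
    (root / "tracing_on").write_text("1")


def save_trace(dest, tracefs="/sys/kernel/tracing"):
    """Copy the current contents of the trace buffer to `dest`."""
    Path(dest).write_text((Path(tracefs) / "trace").read_text())
```

Calling enable_sched_tracing() before starting the workload and
save_trace("trace.txt") after the WARN has stopped tracing should give a
file that can be shared; the trace_printk output lands in the same buffer
as the events.
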
Thanks!
Juri
---
 kernel/sched/core.c  | 4 ++++
 kernel/sched/sched.h | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index da2062de97a2..7b32828b94bc 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5011,6 +5011,7 @@ static void do_balance_callbacks(struct rq *rq, struct balance_callback *head)
 		head->next = NULL;
 		head = next;
 
+		trace_printk("cpu=%d callback=%pS\n", rq->cpu, func);
 		func(rq);
 	}
 }
@@ -8211,6 +8212,7 @@ static void balance_push(struct rq *rq)
 	/*
 	 * Ensure the thing is persistent until balance_push_set(.on = false);
 	 */
+	trace_printk("cpu=%d callback=%pS\n", rq->cpu, &balance_push_callback);
 	rq->balance_callback = &balance_push_callback;
 
 	/*
@@ -8273,8 +8275,10 @@ static void balance_push_set(int cpu, bool on)
 	rq_lock_irqsave(rq, &rf);
 	if (on) {
 		WARN_ON_ONCE(rq->balance_callback);
+		trace_printk("cpu=%d on=%d callback=%pS\n", rq->cpu, on, &balance_push_callback);
 		rq->balance_callback = &balance_push_callback;
 	} else if (rq->balance_callback == &balance_push_callback) {
+		trace_printk("cpu=%d on=%d callback=%pS\n", rq->cpu, on, NULL);
 		rq->balance_callback = NULL;
 	}
 	rq_unlock_irqrestore(rq, &rf);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index b5367c514c14..f91fc2d36c81 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1963,6 +1963,8 @@ queue_balance_callback(struct rq *rq,
 	if (unlikely(head->next || rq->balance_callback == &balance_push_callback))
 		return;
 
+	trace_printk("cpu=%d callback=%pS\n", rq->cpu, func);
+
 	head->func = func;
 	head->next = rq->balance_callback;
 	rq->balance_callback = head;
--