Message-ID: <86fbf707-9ecf-4941-ae70-3332c360533d@linux.ibm.com>
Date: Wed, 8 Oct 2025 23:39:11 +0530
From: Shrikanth Hegde <sshegde@...ux.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>,
Venkat Rao Bagalkote <venkat88@...ux.ibm.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
Madhavan Srinivasan <maddy@...ux.ibm.com>, jstultz@...gle.com,
stultz@...gle.com
Subject: Re: [bisected][mainline]Kernel warnings at
kernel/sched/cpudeadline.c:219
On 10/8/25 4:43 PM, Peter Zijlstra wrote:
> On Wed, Oct 08, 2025 at 03:47:16PM +0530, Shrikanth Hegde wrote:
>>
>>
>> On 10/8/25 3:20 PM, Peter Zijlstra wrote:
>>> On Wed, Oct 08, 2025 at 07:41:10AM +0530, Venkat Rao Bagalkote wrote:
>>>> Greetings!!!
>>>>
>>>>
>>>> IBM CI has reported kernel warnings while running a CPU hot plug
>>>> operation on an IBM Power9 system.
>>>>
>>>>
>>>> Command to reproduce the issue:
>>>>
>>>> drmgr -c cpu -r -q 1
>>>>
>
> I do not know what drmgr is. I am not familiar with PowerPC tools.
> AFAICT x86 never modifies cpu_present_mask after boot.
>
It is a tool which allows dynamic addition and removal of CPUs/memory. It does indeed change the present CPUs.
I am not very well versed with it either :)
>> Maybe during drmgr, the dl server gets started again? Maybe that's why the above patch didn't work.
>> Will dig in and understand this a bit more.
>
> dl_server is per cpu and is started on enqueue of a fair task when:
>
> - the runqueue was empty; and
> - the dl_server wasn't already active
>
> Once the dl_server is active it has this timer (you already found this);
> this timer is set for the 0-laxity moment (the last possible moment in
> time where it can still run its budget and not be late). During this
> time any fair runtime is accounted against its budget (subtracted from it).
>
> Once the timer fires, if it still has budget left it will enqueue the
> deadline entity. However, the more common case is that its budget will be
> depleted, in which case the timer is reset to the period end for
> replenishment (where it gets new runtime budget), after which it's back to
> the 0-laxity point.
>
> If the deadline entity gets scheduled, it will try and pick a fair task
> and run that. In the case where there is no fair task, it will
> deactivate itself.
ok cool.
>
> The patch I sent earlier would force stop the deadline timer on CPU
> offline.
>
>
>> Also, I tried the below diff, which fixes it. It just ignores the hrtimer if the cpu is offline.
>> Does this make sense?
>> ---
>>
>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>> index 615411a0a881..a342cf5e4624 100644
>> --- a/kernel/sched/deadline.c
>> +++ b/kernel/sched/deadline.c
>> @@ -1160,6 +1160,9 @@ static enum hrtimer_restart dl_server_timer(struct hrtimer *timer, struct sched_
>> scoped_guard (rq_lock, rq) {
>> struct rq_flags *rf = &scope.rf;
>> + if (!cpu_online(rq->cpu))
>> + return HRTIMER_NORESTART;
>> +
>> if (!dl_se->dl_throttled || !dl_se->dl_runtime)
>> return HRTIMER_NORESTART;
>
> This could leave the dl_server in inconsistent state. It would have to
> call dl_server_stop() or something along those lines.
>
> Also, this really should not happen; per my previous patch we should be
> stopping the timer when we go offline.
>
> Since you can readily reproduce this; perhaps you could stick something
> like this in dl_server_start():
>
> WARN_ON_ONCE(!cpu_online(rq->cpu));
>
> See if anybody is (re)starting the thing?
So I used this diff to find out who is enabling it again after it was stopped during offline.
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 198d2dd45f59..83e77bbbb6b4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8328,6 +8328,8 @@ static inline void sched_set_rq_offline(struct rq *rq, int cpu)
BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
set_rq_offline(rq);
}
+ dl_server_stop(&rq->fair_server);
+
rq_unlock_irqrestore(rq, &rf);
}
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 615411a0a881..5847540bdc18 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1582,6 +1582,8 @@ void dl_server_start(struct sched_dl_entity *dl_se)
if (!dl_server(dl_se) || dl_se->dl_server_active)
return;
+ WARN_ON(!rq->online);
+
dl_se->dl_server_active = 1;
enqueue_dl_entity(dl_se, ENQUEUE_WAKEUP);
if (!dl_task(dl_se->rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl))
*It pointed to this*
NIP [c0000000001fd798] dl_server_start+0x50/0xd8
LR [c0000000001d9534] enqueue_task_fair+0x228/0x8ec
Call Trace:
[c000006684a579c0] [0000000000000001] 0x1 (unreliable)
[c000006684a579f0] [c0000000001d9534] enqueue_task_fair+0x228/0x8ec
[c000006684a57a60] [c0000000001bb344] enqueue_task+0x5c/0x1c8
[c000006684a57aa0] [c0000000001c5fc0] ttwu_do_activate+0x98/0x2fc
[c000006684a57af0] [c0000000001c671c] try_to_wake_up+0x2e0/0xa60
[c000006684a57b80] [c00000000019fb48] kthread_park+0x7c/0xf0
[c000006684a57bb0] [c00000000015fefc] takedown_cpu+0x60/0x194
[c000006684a57c00] [c000000000161924] cpuhp_invoke_callback+0x1f4/0x9a4
[c000006684a57c90] [c0000000001621a4] __cpuhp_invoke_callback_range+0xd0/0x188
[c000006684a57d30] [c000000000165aec] _cpu_down+0x19c/0x560
[c000006684a57df0] [c0000000001637c0] __cpu_down_maps_locked+0x2c/0x3c
[c000006684a57e10] [c00000000018a100] work_for_cpu_fn+0x38/0x54
[c000006684a57e40] [c00000000019075c] process_one_work+0x1d8/0x554
[c000006684a57ef0] [c00000000019165c] worker_thread+0x308/0x46c
[c000006684a57f90] [c00000000019e474] kthread+0x16c/0x19c
[c000006684a57fe0] [c00000000000dd58] start_kernel_thread+0x14/0x18
It is takedown_cpu() called from CPU0 (the boot CPU), and it wakes up a kthread which is CPU-bound, I guess.
Since this happens after the rq was marked offline, it ends up starting the deadline server again.
So I think it is a sensible idea to stop the deadline server when the CPU is going down.
Once we stop the server, we return HRTIMER_NORESTART.
This does fix the warning. Does this look any good?
---
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 615411a0a881..831797b9ec0f 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1160,11 +1160,14 @@ static enum hrtimer_restart dl_server_timer(struct hrtimer *timer, struct sched_
scoped_guard (rq_lock, rq) {
struct rq_flags *rf = &scope.rf;
+ update_rq_clock(rq);
+ if (!cpu_online(rq->cpu))
+ dl_server_stop(dl_se);
+
if (!dl_se->dl_throttled || !dl_se->dl_runtime)
return HRTIMER_NORESTART;
sched_clock_tick();
- update_rq_clock(rq);
if (!dl_se->dl_runtime)
return HRTIMER_NORESTART;
----
Also, the check below is a duplicate; we do the same check earlier in the function.
if (!dl_se->dl_runtime)
return HRTIMER_NORESTART;