Message-ID: <86fbf707-9ecf-4941-ae70-3332c360533d@linux.ibm.com>
Date: Wed, 8 Oct 2025 23:39:11 +0530
From: Shrikanth Hegde <sshegde@...ux.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>,
Venkat Rao Bagalkote <venkat88@...ux.ibm.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
Madhavan Srinivasan <maddy@...ux.ibm.com>, jstultz@...gle.com,
stultz@...gle.com
Subject: Re: [bisected][mainline]Kernel warnings at
kernel/sched/cpudeadline.c:219
On 10/8/25 4:43 PM, Peter Zijlstra wrote:
> On Wed, Oct 08, 2025 at 03:47:16PM +0530, Shrikanth Hegde wrote:
>>
>>
>> On 10/8/25 3:20 PM, Peter Zijlstra wrote:
>>> On Wed, Oct 08, 2025 at 07:41:10AM +0530, Venkat Rao Bagalkote wrote:
>>>> Greetings!!!
>>>>
>>>>
>>>> IBM CI has reported kernel warnings while running a CPU hot plug
>>>> operation on an IBM Power9 system.
>>>>
>>>>
>>>> Command to reproduce the issue:
>>>>
>>>> drmgr -c cpu -r -q 1
>>>>
>
> I do not know what drmgr is. I am not familiar with PowerPC tools.
> AFAICT x86 never modifies cpu_present_mask after boot.
>
It is a tool which allows dynamic addition and removal of CPUs/memory. It does indeed change the present CPUs.
I am not very well versed with it either :)
>> Maybe during drmgr, the dl server gets started again? Maybe that's why the above patch didn't work.
>> Will dig in and understand this a bit more.
>
> dl_server is per cpu and is started on enqueue of a fair task when:
>
> - the runqueue was empty; and
> - the dl_server wasn't already active
>
> Once the dl_server is active it has this timer (you already found this);
> this timer is set for the 0-laxity moment (the last possible moment in
> time where it can still run its budget and not be late). During this
> time any fair runtime is accounted against its budget (subtracted from it).
>
> Once the timer fires, if it still has budget left it will enqueue the
> deadline entity. However, the more common case is that its budget will be
> depleted, in which case the timer is reset to the period end for
> replenishment (where it gets new runtime budget), after which it's back to
> the 0-laxity point.
>
> If the deadline entity gets scheduled, it will try and pick a fair task
> and run that. In the case where there is no fair task, it will
> deactivate itself.
ok cool.
>
> The patch I sent earlier would force stop the deadline timer on CPU
> offline.
>
>
>> Also, I tried the below diff, which fixes it. It just ignores the hrtimer if the cpu is offline.
>> Does this make sense?
>> ---
>>
>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>> index 615411a0a881..a342cf5e4624 100644
>> --- a/kernel/sched/deadline.c
>> +++ b/kernel/sched/deadline.c
>> @@ -1160,6 +1160,9 @@ static enum hrtimer_restart dl_server_timer(struct hrtimer *timer, struct sched_
>> scoped_guard (rq_lock, rq) {
>> struct rq_flags *rf = &scope.rf;
>> + if (!cpu_online(rq->cpu))
>> + return HRTIMER_NORESTART;
>> +
>> if (!dl_se->dl_throttled || !dl_se->dl_runtime)
>> return HRTIMER_NORESTART;
>
> This could leave the dl_server in inconsistent state. It would have to
> call dl_server_stop() or something along those lines.
>
> Also, this really should not happen; per my previous patch we should be
> stopping the timer when we go offline.
>
> Since you can readily reproduce this; perhaps you could stick something
> like this in dl_server_start():
>
> WARN_ON_ONCE(!cpu_online(rq->cpu));
>
> See if anybody is (re)starting the thing?
So I used this diff to find out who is enabling it again after it was stopped during offline.
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 198d2dd45f59..83e77bbbb6b4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8328,6 +8328,8 @@ static inline void sched_set_rq_offline(struct rq *rq, int cpu)
BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
set_rq_offline(rq);
}
+ dl_server_stop(&rq->fair_server);
+
rq_unlock_irqrestore(rq, &rf);
}
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 615411a0a881..5847540bdc18 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1582,6 +1582,8 @@ void dl_server_start(struct sched_dl_entity *dl_se)
if (!dl_server(dl_se) || dl_se->dl_server_active)
return;
+ WARN_ON(!rq->online);
+
dl_se->dl_server_active = 1;
enqueue_dl_entity(dl_se, ENQUEUE_WAKEUP);
if (!dl_task(dl_se->rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl))
*It pointed to this*
NIP [c0000000001fd798] dl_server_start+0x50/0xd8
LR [c0000000001d9534] enqueue_task_fair+0x228/0x8ec
Call Trace:
[c000006684a579c0] [0000000000000001] 0x1 (unreliable)
[c000006684a579f0] [c0000000001d9534] enqueue_task_fair+0x228/0x8ec
[c000006684a57a60] [c0000000001bb344] enqueue_task+0x5c/0x1c8
[c000006684a57aa0] [c0000000001c5fc0] ttwu_do_activate+0x98/0x2fc
[c000006684a57af0] [c0000000001c671c] try_to_wake_up+0x2e0/0xa60
[c000006684a57b80] [c00000000019fb48] kthread_park+0x7c/0xf0
[c000006684a57bb0] [c00000000015fefc] takedown_cpu+0x60/0x194
[c000006684a57c00] [c000000000161924] cpuhp_invoke_callback+0x1f4/0x9a4
[c000006684a57c90] [c0000000001621a4] __cpuhp_invoke_callback_range+0xd0/0x188
[c000006684a57d30] [c000000000165aec] _cpu_down+0x19c/0x560
[c000006684a57df0] [c0000000001637c0] __cpu_down_maps_locked+0x2c/0x3c
[c000006684a57e10] [c00000000018a100] work_for_cpu_fn+0x38/0x54
[c000006684a57e40] [c00000000019075c] process_one_work+0x1d8/0x554
[c000006684a57ef0] [c00000000019165c] worker_thread+0x308/0x46c
[c000006684a57f90] [c00000000019e474] kthread+0x16c/0x19c
[c000006684a57fe0] [c00000000000dd58] start_kernel_thread+0x14/0x18
It is takedown_cpu() called from CPU0 (the boot CPU), and it wakes up a kthread which is CPU-bound, I guess.
Since this happens after the rq was marked offline, it ends up starting the deadline server again.
So I think it is a sensible idea to stop the deadline server when the CPU is going down.
Once we stop the server, we return HRTIMER_NORESTART.
This does fix the warning. Does this look any good?
---
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 615411a0a881..831797b9ec0f 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1160,11 +1160,14 @@ static enum hrtimer_restart dl_server_timer(struct hrtimer *timer, struct sched_
scoped_guard (rq_lock, rq) {
struct rq_flags *rf = &scope.rf;
+ update_rq_clock(rq);
+ if (!cpu_online(rq->cpu))
+ dl_server_stop(dl_se);
+
if (!dl_se->dl_throttled || !dl_se->dl_runtime)
return HRTIMER_NORESTART;
sched_clock_tick();
- update_rq_clock(rq);
if (!dl_se->dl_runtime)
return HRTIMER_NORESTART;
----
Also, the check below is a duplicate; we do the same check earlier in the function.
if (!dl_se->dl_runtime)
return HRTIMER_NORESTART;