linux-kernel - Re: sched: observed instability under stress in 6.12 and mainline

Message-ID: <20251013030309.2176524-1-jiping.ma2@windriver.com>
Date: Mon, 13 Oct 2025 03:03:09 +0000
From: Jiping Ma <jiping.ma2@...driver.com>
To: jiahao.kernel@...il.com
Cc: chris.friesen@...driver.com, jiping.ma2@...driver.com,
        hanguangjiang@...iang.com, linux-kernel@...r.kernel.org,
        osandov@...com, peterz@...radead.org
Subject: Re: sched: observed instability under stress in 6.12 and mainline

>> Hi,
>> 
>> I'd like to draw the attention of the scheduler maintainers to a number 
>> of kernel bugzilla reports submitted by a colleague a couple of weeks ago:
>> 
>> 6.12.18:
>> https://bugzilla.kernel.org/show_bug.cgi?id=220447
>> https://bugzilla.kernel.org/show_bug.cgi?id=220448
>> 
>> v6.16-rt3
>> https://bugzilla.kernel.org/show_bug.cgi?id=220450
>> https://bugzilla.kernel.org/show_bug.cgi?id=220449
>> 
>> There seems to be something wrong with either the logic or the locking. 
>> In one case this resulted in a NULL pointer dereference in 
>> pick_next_entity().  In another case it resulted in 
>> BUG_ON(!rq->nr_running) in dequeue_top_rt_rq() and 
>> SCHED_WARN_ON(!se->on_rq) in update_entity_lag().
>> 
>> My colleague suggests that the NULL pointer dereference may be due to 
>> pick_eevdf() returning NULL in pick_next_entity().
>> 
>> I did some digging and found that 
>> https://gitlab.com/linux-kernel/stable/-/commit/86b37810 would not have 
>> been included in 6.12.18, but the equivalent fix should have been in the 
>> 6.16 load.
>> 
>> We haven't yet bottomed out the root cause.
>> 
>> Any suggestions or assistance would be appreciated.
>> 
>> Thanks,
>> Chris
>> 
>> 
>
>Maybe this patch can be useful for your problem.
>https://lore.kernel.org/all/tencent_3177343A3163451463643E434C61911B4208@qq.com/
>
>If I understand correctly, we may dequeue_entity twice in 
>rt_mutex_setprio()/__sched_setscheduler(). cfs_bandwidth may break the 
>state of p->on_rq and se->on_rq.

Thank you very much!
The patch at https://lore.kernel.org/all/tencent_3177343A3163451463643E434C61911B4208@qq.com/
fixes the original panic reported in https://bugzilla.kernel.org/show_bug.cgi?id=220447.
However, we now hit the other warning, !se->on_rq (log below). Do you know whether a fix
for it already exists?

Any suggestions or assistance would be appreciated.

[ 1461.107139] [  T17007] !se->on_rq
[ 1461.107144] [  T17007] WARNING: CPU: 1 PID: 17007 at kernel/sched/fair.c:704 update_entity_lag+0x7c/0x90
......
[ 1461.107339] [  T17007] CPU: 1 UID: 0 PID: 17007 Comm: containerd Kdump: loaded Tainted: G           O       6.12.0-1-rt-amd64 #1  Debian 6.12.40-1.stx.130
[ 1461.107344] [  T17007] Tainted: [O]=OOT_MODULE
[ 1461.107345] [  T17007] Hardware name: Dell Inc. PowerEdge XR8720t/0K54D0, BIOS 0.2.4 [X-REV] 08/11/2025
[ 1461.107347] [  T17007] RIP: 0010:update_entity_lag+0x7c/0x90
[ 1461.107352] [  T17007] Code: 0f 4c fd 48 89 7b 78 5b 5d c3 cc cc cc cc 80 3d 63 cb d3 01 00 75 aa 48 c7 c7 62 b4 8c 9f c6 05 53 cb d3 01 01 e8 64 a6 fa ff <0f> 0b eb 93 48 89 de e8 f8 a3 ff ff 48 89 c7 eb b9 0f 1f 00 90 90
[ 1461.107355] [  T17007] RSP: 0018:ff77b604c779f828 EFLAGS: 00010082
[ 1461.107358] [  T17007] RAX: 0000000000000000 RBX: ff4e7dface2ab000 RCX: ff4e7e183baa0908
[ 1461.107360] [  T17007] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ff4e7e183baa0900
[ 1461.107362] [  T17007] RBP: ff4e7df9c8e7ee00 R08: 0000000000000000 R09: ff77b604c779f7b8
[ 1461.107364] [  T17007] R10: 0000000000000001 R11: ff4e7e193edbb0a8 R12: 0000000000000009
[ 1461.107365] [  T17007] R13: 0000000000000001 R14: 0000000000000000 R15: ff4e7dface2ab100
[ 1461.107367] [  T17007] FS:  00007faad3d1e700(0000) GS:ff4e7e183ba80000(0000) knlGS:0000000000000000
[ 1461.107370] [  T17007] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1461.107372] [  T17007] CR2: 000055add20c7000 CR3: 000000022b1a6004 CR4: 0000000000773ef0
[ 1461.107373] [  T17007] PKRU: 55555554
[ 1461.107375] [  T17007] Call Trace:
[ 1461.107378] [  T17007]  <TASK>
[ 1461.107381] [  T17007]  dequeue_entity+0x95/0x600
[ 1461.107384] [  T17007]  dequeue_entities+0xc9/0x590
[ 1461.107387] [  T17007]  dequeue_task_fair+0xd5/0x1f0
[ 1461.107390] [  T17007]  ? sched_clock+0xc/0x30
[ 1461.107395] [  T17007]  detach_task+0x36/0x60
[ 1461.107399] [  T17007]  sched_balance_rq+0x77f/0xe70
[ 1461.107404] [  T17007]  sched_balance_newidle+0x1c8/0x430
[ 1461.107407] [  T17007]  pick_next_task_fair+0x2e/0x3c0
[ 1461.107410] [  T17007]  __schedule+0x269/0xbb0
[ 1461.107416] [  T17007]  ? hrtimer_start_range_ns+0x2e1/0x460
[ 1461.107421] [  T17007]  schedule+0x23/0xf0
[ 1461.107424] [  T17007]  do_nanosleep+0x65/0x150
[ 1461.107429] [  T17007]  hrtimer_nanosleep+0x7a/0xf0
[ 1461.107432] [  T17007]  ? __pfx_hrtimer_wakeup+0x10/0x10
[ 1461.107436] [  T17007]  __x64_sys_nanosleep+0xac/0xe0

Thanks,
Jiping


>Thanks,
>Hao