[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20241127055610.7076-1-adamli@os.amperecomputing.com>
Date: Wed, 27 Nov 2024 05:56:07 +0000
From: Adam Li <adamli@...amperecomputing.com>
To: peterz@...radead.org,
mingo@...hat.com,
juri.lelli@...hat.com,
vincent.guittot@...aro.org
Cc: dietmar.eggemann@....com,
rostedt@...dmis.org,
bsegall@...gle.com,
mgorman@...e.de,
vschneid@...hat.com,
linux-kernel@...r.kernel.org,
patches@...erecomputing.com,
cl@...ux.com,
christian.loehle@....com,
vineethr@...ux.ibm.com,
Adam Li <adamli@...amperecomputing.com>
Subject: [PATCH v2 0/3] sched/fair: Fix NEXT_BUDDY panic and warning
When running Specjbb workload with NEXT_BUDDY enabled, kernel warning,
rcu stall and panic may be triggered.
The kernel panic is triggered in pick_next_entity() if pick_eevdf()
returns NULL.
In patch 1 ("Fix warning if NEXT_BUDDY enabled"), if sched_delayed is set
we do not set next buddy. After the patch, kernel warning, rcu stall and
panic disappear. However to avoid panic, we still check return value of
pick_eevdf().
The 'last' and 'skip' buddy are obsoleted by EEVDF. Update the comments in
pick_next_entity().
Detail log bellow:
[ 124.972623] ------------[ cut here ]------------
[ 124.977300] cfs_rq->next->sched_delayed
[ 124.977310] WARNING: CPU: 51 PID: 2150 at kernel/sched/fair.c:5621 pick_task_fair+0x130/0x150
[ 125.049547] CPU: 51 UID: 0 PID: 2150 Comm: kworker/51:1 Tainted: G E 6.12.0.adam+ #1
[ 125.058678] Tainted: [E]=UNSIGNED_MODULE
[ 125.062591] Hardware name: IEI NF5280R7/Mitchell MB, BIOS 4.4.3.1 10/16/2024
[ 125.069629] Workqueue: 0x0 (mm_percpu_wq)
[ 125.073721] pstate: 634000c9 (nZCv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[ 125.080671] pc : pick_task_fair+0x130/0x150
[ 125.084841] lr : pick_task_fair+0x130/0x150
[ 125.089013] sp : ffff8000ab41bc10
[ 125.092315] x29: ffff8000ab41bc10 x28: 0000000000000000 x27: 0000000000000000
[ 125.099440] x26: ffff000123bd8788 x25: 0000000000000402 x24: 0000000000000001
[ 125.106565] x23: ffff000123bd8000 x22: ffff007dfef5cd00 x21: ffff007dfef5cd80
[ 125.113689] x20: ffff007dfef5cd80 x19: ffff2001ab20a780 x18: 0000000000000006
[ 125.120815] x17: 0000000000000000 x16: 0000000000000000 x15: ffff8000ab41b5e0
[ 125.127938] x14: 0000000000000000 x13: 646579616c65645f x12: 64656863733e2d74
[ 125.135062] x11: fffffffffc000000 x10: ffff207dfac9b420 x9 : ffff80008014ed60
[ 125.142185] x8 : 00000000ffdfffff x7 : ffff207dfac80000 x6 : 000000000000122c
[ 125.149309] x5 : ffff007dfef49408 x4 : 40000000ffe0122c x3 : ffff807d7d673000
[ 125.156433] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000123bd8000
[ 125.163561] Call trace:
[ 125.165996] pick_task_fair+0x130/0x150 (P)
[ 125.170167] pick_task_fair+0x130/0x150 (L)
[ 125.174338] pick_next_task_fair+0x48/0x3c0
[ 125.178512] __pick_next_task+0x4c/0x220
[ 125.182426] pick_next_task+0x44/0x980
[ 125.186163] __schedule+0x3d0/0x628
[ 125.189645] schedule+0x3c/0xe0
[ 125.192776] worker_thread+0x1a4/0x368
[ 125.196516] kthread+0xfc/0x110
[ 125.199647] ret_from_fork+0x10/0x20
[ 125.203213] ---[ end trace 0000000000000000 ]---
[ 125.207818] ------------[ cut here ]------------
[ 211.151849] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 211.159759] rcu: (detected by 141, t=15003 jiffies, g=5629, q=26516 ncpus=384)
[ 211.169780] rcu: All QSes seen, last rcu_preempt kthread activity 15002 (4294943634-4294928632), jiffies_till_next_fqs=2, root ->qsmask 0x0
[ 211.185062] rcu: rcu_preempt kthread timer wakeup didn't happen for 15004 jiffies! g5629 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0
[ 211.199043] rcu: Possible timer handling issue on cpu=352 timer-softirq=1091
[ 211.208943] rcu: rcu_preempt kthread starved for 15012 jiffies! g5629 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=352
[ 211.222141] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 211.234037] rcu: RCU grace-period kthread stack dump:
[ 211.241854] task:rcu_preempt state:R running task stack:0 pid:17 tgid:17 ppid:2 flags:0x00000008
[ 211.255487] Call trace:
[ 211.260698] __switch_to+0xf0/0x150 (T)
[ 211.267299] __schedule+0x238/0x628
[ 211.273553] schedule+0x3c/0xe0
[ 211.279459] schedule_timeout+0x88/0x108
[ 211.286147] rcu_gp_fqs_loop+0x158/0x4d0
[ 211.292835] rcu_gp_kthread+0x164/0x198
[ 211.299436] kthread+0xfc/0x110
[ 211.305342] ret_from_fork+0x10/0x20
[ 211.311684] rcu: Stack dump where RCU GP kthread last ran:
[ 211.319940] Sending NMI from CPU 141 to CPUs 352:
[ 211.327411] NMI backtrace for cpu 352
[ 211.333835] CPU: 352 UID: 0 PID: 0 Comm: swapper/352 Tainted: G W E 6.12.0.adam+ #1
[ 211.345466] Tainted: [W]=WARN, [E]=UNSIGNED_MODULE
[ 211.353021] Hardware name: IEI NF5280R7/Mitchell MB, BIOS 4.4.3.1 10/16/2024
[ 211.362834] pstate: 63400009 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[ 211.372557] pc : cpuidle_enter_state+0xcc/0x4f0
[ 211.379851] lr : cpuidle_enter_state+0xc0/0x4f0
[ 211.387147] sp : ffff8000878c3d70
[ 211.393226] x29: ffff8000878c3d70 x28: 0000000000000000 x27: 0000000000000000
[ 211.403125] x26: 0000000000000000 x25: 0000000000000000 x24: 0000003133e26b84
[ 211.413023] x23: 0000000000000000 x22: ffff800082092d98 x21: 00000031341844e8
[ 211.422922] x20: 0000000000000000 x19: ffff20011d459800 x18: 0000000000000000
[ 211.432820] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 211.442719] x14: 0000000000000000 x13: ffff800081dcf030 x12: 0000000000000001
[ 211.452619] x11: 0000003131433c14 x10: 071c71c71c71c71c x9 : ffff8000810b5900
[ 211.462517] x8 : 00000000003bf790 x7 : ffff207e031d57e4 x6 : 0000002023a22338
[ 211.472416] x5 : 1fffffffffffffff x4 : 0000000000000015 x3 : 000000000030e5a8
[ 211.482314] x2 : ffffa07d818ed000 x1 : ffff207e031d6d00 x0 : 0000000000000000
[ 211.492213] Call trace:
[ 211.497426] cpuidle_enter_state+0xcc/0x4f0 (P)
[ 211.504719] cpuidle_enter_state+0xc0/0x4f0 (L)
[ 211.512013] cpuidle_enter+0x40/0x60
[ 211.518351] cpuidle_idle_call+0x130/0x1c8
[ 211.525212] do_idle+0xec/0xf8
[ 211.531033] cpu_startup_entry+0x40/0x50
[ 211.537719] secondary_start_kernel+0xe0/0x120
[ 211.544926] __secondary_switched+0xc0/0xc8
[ 297.371198] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000051
[ 297.406112] CPU: 116 UID: 0 PID: 10328 Comm: Grizzly-worker( Tainted: G W E 6.12.0.adam+ #1
[ 297.414884] Mem abort info:
[ 297.424437] Tainted: [W]=WARN, [E]=UNSIGNED_MODULE
[ 297.427219] ESR = 0x0000000096000005
[ 297.431997] Hardware name: IEI NF5280R7/Mitchell MB, BIOS 4.4.3.1 10/16/2024
[ 297.435734] EC = 0x25: DABT (current EL), IL = 32 bits
[ 297.442770] pstate: a34000c9 (NzCv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[ 297.448069] SET = 0, FnV = 0
[ 297.455018] pc : pick_task_fair+0x50/0x150
[ 297.458060] EA = 0, S1PTW = 0
[ 297.462144] lr : pick_task_fair+0x50/0x150
[ 297.465274] FSC = 0x05: level 1 translation fault
[ 297.469358] sp : ffff800101d93ae0
[ 297.474223] Data abort info:
[ 297.477526] x29: ffff800101d93ae0
[ 297.480395] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[ 297.480395] x28: 0000000000000009
[ 297.483703] x27: 0000000000000000
[ 297.489177] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 297.492567]
[ 297.495956] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 297.500996] x26: ffff006da4381b08
[ 297.502477] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000198b3b000
[ 297.507777] x25: 0000000000000080
[ 297.511080] [0000000000000051] pgd=08000001c0636403
[ 297.517509] x24: 0000000000000001
[ 297.520899] , p4d=08000001c0636403
[ 297.525765]
[ 297.529155] , pud=0000000000000000
[ 297.532545] x23: ffff006da4381380
[ 297.534025]
[ 297.537415] x22: ffff007dff7fed00 x21: ffff007dff7fed80
[ 297.547496] x20: ffff000167f60c00 x19: 0000000000000000 x18: 0000000000000006
[ 297.554621] x17: ffff8000820b3be8 x16: 0000000087c17f9e x15: ffff800083d53690
[ 297.561745] x14: 0000000000000004 x13: ffff800081df4ac8 x12: 0000000000000000
[ 297.568868] x11: ffff200111a3f0b0 x10: ffff200111a3efc8 x9 : ffff800080109e48
[ 297.575992] x8 : 00000000000000b8 x7 : 0000000000000074 x6 : 0000000000000002
[ 297.583115] x5 : 0000000000000002 x4 : 0000000000000002 x3 : 0000000000000000
[ 297.590239] x2 : fffffffffffffff0 x1 : 0000000000000000 x0 : 0000000000000000
[ 297.597362] Call trace:
[ 297.599795] pick_task_fair+0x50/0x150 (P)
[ 297.603879] pick_task_fair+0x50/0x150 (L)
[ 297.607963] pick_next_task_fair+0x30/0x3c0
[ 297.612134] __pick_next_task+0x4c/0x220
[ 297.616045] pick_next_task+0x44/0x980
[ 297.619782] __schedule+0x3d0/0x628
[ 297.623259] do_task_dead+0x50/0x60
[ 297.626736] do_exit+0x28c/0x410
[ 297.629955] do_group_exit+0x3c/0xa0
[ 297.633518] get_signal+0x8c4/0x8d0
[ 297.636996] do_signal+0x9c/0x270
[ 297.640299] do_notify_resume+0xe0/0x198
[ 297.644212] el0_svc+0xf4/0x170
[ 297.647342] el0t_64_sync_handler+0x10c/0x138
[ 297.651687] el0t_64_sync+0x1ac/0x1b0
[ 297.655339] Code: d503201f 1400002a aa1403e0 97ffda0b (39414401)
[ 297.661439] ---[ end trace 0000000000000000 ]---
[ 297.726593] Kernel panic - not syncing: Oops: Fatal exception
v2:
Follow Christian Loehle's suggestion, revise commit message.
Add patch to check return value of pick_eevdf() in pick_next_entity().
v1:
https://lore.kernel.org/all/20241125021222.356881-1-adamli@os.amperecomputing.com/
Adam Li (3):
sched/fair: Fix warning if NEXT_BUDDY enabled
sched/fair: Fix panic if pick_eevdf() returns NULL
sched/fair: Update comments regarding last and skip buddy
kernel/sched/fair.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
--
2.25.1
Powered by blists - more mailing lists