Message-ID: <20250925090001.GX4067720@noisy.programming.kicks-ass.net>
Date: Thu, 25 Sep 2025 11:00:01 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Christian Loehle <christian.loehle@....com>
Cc: tj@...nel.org, linux-kernel@...r.kernel.org, mingo@...hat.com,
juri.lelli@...hat.com, vincent.guittot@...aro.org,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, vschneid@...hat.com, longman@...hat.com,
hannes@...xchg.org, mkoutny@...e.com, void@...ifault.com,
arighi@...dia.com, changwoo@...lia.com, cgroups@...r.kernel.org,
sched-ext@...ts.linux.dev, liuwenfang@...or.com, tglx@...utronix.de
Subject: Re: [PATCH 00/14] sched: Support shared runqueue locking
On Thu, Sep 18, 2025 at 04:15:45PM +0100, Christian Loehle wrote:
> Hi Peter,
>
> A couple of issues popped up when testing this [1] (that don't trigger on [2]):
>
> When booting (arm64 orion o6) I get:
>
> [ 1.298020] sched: DL replenish lagged too much
> [ 1.298364] ------------[ cut here ]------------
> [ 1.298377] WARNING: CPU: 4 PID: 0 at kernel/sched/deadline.c:239 inactive_task_timer+0x3d0/0x474
Ah, right. The robot reported this one too. I'll look into it. It could
be that one of the patches in sched/urgent cures it, but who knows. I'll
have a poke.
> and when running actual tests (e.g. iterating through all scx schedulers under load):
>
> [ 146.532691] ================================
> [ 146.536947] WARNING: inconsistent lock state
> [ 146.541204] 6.17.0-rc4-cix-build+ #58 Tainted: G S W
> [ 146.547457] --------------------------------
> [ 146.551713] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
> [ 146.557705] rcu_tasks_trace/79 [HC0[0]:SC0[0]:HE0:SE1] takes:
> [ 146.563438] ffff000089c90e58 (&dsq->lock){?.-.}-{2:2}, at: __task_rq_lock+0x88/0x194
> [ 146.813242] #0: ffff800082e6e650 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{4:4}, at: rcu_tasks_one_gp+0x328/0x570
> [ 146.823403] #1: ffff800082adc1f0 (cpu_hotplug_lock){++++}-{0:0}, at: cpus_read_lock+0x10/0x1c
> [ 146.832014] #2: ffff000089c90e58 (&dsq->lock){?.-.}-{2:2}, at: __task_rq_lock+0x88/0x194
>
> [ 146.840178] stack backtrace:
> [ 146.844521] CPU: 10 UID: 0 PID: 79 Comm: rcu_tasks_trace Tainted: G S W 6.17.0-rc4-cix-build+ #58 PREEMPT
> [ 146.855463] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
> [ 146.860240] Hardware name: Radxa Computer (Shenzhen) Co., Ltd. Radxa Orion O6/Radxa Orion O6, BIOS 0.3.0-1 2025-04-28T03:35:34+00:00
> [ 146.872136] Sched_ext: simple (enabled+all), task: runnable_at=-4ms
> [ 146.872138] Call trace:
> [ 146.880822] show_stack+0x18/0x24 (C)
> [ 146.884471] dump_stack_lvl+0x90/0xd0
> [ 146.888131] dump_stack+0x18/0x24
> [ 146.891432] print_usage_bug.part.0+0x29c/0x364
> [ 146.895950] mark_lock+0x778/0x978
> [ 146.899338] mark_held_locks+0x58/0x90
> [ 146.903074] lockdep_hardirqs_on_prepare+0x100/0x210
> [ 146.908025] trace_hardirqs_on+0x5c/0x1cc
> [ 146.912025] _raw_spin_unlock_irqrestore+0x6c/0x70
> [ 146.916803] task_call_func+0x110/0x164
Ooh, yeah, that's buggered. Let me go fix!
Thanks for testing!
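[Editorial note: for readers unfamiliar with this class of lockdep splat,
the {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} inconsistency means the same lock was
observed taken from hardirq context in one path, and later held with
interrupts enabled in another. The sketch below is illustrative only and
not code from the patch set; all names besides dsq->lock are hypothetical.]

```c
/*
 * Illustrative sketch of the usage pattern lockdep is complaining
 * about, under the assumption that dsq->lock is also taken from
 * hardirq context elsewhere.  Not actual kernel code.
 */

/* Path A: lock taken in hardirq context -> {IN-HARDIRQ-W} usage. */
static void some_irq_handler(void)
{
	raw_spin_lock(&dsq->lock);	/* irqs already off in hardirq */
	/* ... */
	raw_spin_unlock(&dsq->lock);
}

/*
 * Path B: the same lock ends up held while interrupts get re-enabled,
 * e.g. an inner unlock_irqrestore() (cf. _raw_spin_unlock_irqrestore
 * under task_call_func in the trace) while an outer caller still holds
 * dsq->lock -> {HARDIRQ-ON-W} usage.  A hardirq arriving here that
 * runs Path A would then deadlock on the held lock.
 */
static void some_task_path(void)
{
	unsigned long flags;

	raw_spin_lock_irqsave(&other_lock, flags);
	raw_spin_lock(&dsq->lock);
	raw_spin_unlock_irqrestore(&other_lock, flags);	/* irqs back on,
							   dsq->lock held */
	/* ... window where Path A's hardirq can fire ... */
	raw_spin_unlock(&dsq->lock);
}
```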