Message-ID: <47b4a790-9a27-2fc5-f2aa-f9981c6da015@huawei.com>
Date: Sun, 7 Apr 2024 22:06:43 +0800
From: "zhaowenhui (A)" <zhaowenhui8@...wei.com>
To: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>, Vincent Guittot
<vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel
Gorman <mgorman@...e.de>, Daniel Bristot de Oliveira <bristot@...hat.com>,
Valentin Schneider <vschneid@...hat.com>, "open list:SCHEDULER"
<linux-kernel@...r.kernel.org>
Subject: [bug report] WARNING: CPU: 0 PID: 49573 at kernel/sched/rt.c:802
rq_offline_rt+0x24d/0x260
Hello,
Recently, our machine triggered a warning in __disable_runtime(). The
dmesg output is as follows:
[ 991.697692] WARNING: CPU: 0 PID: 49573 at kernel/sched/rt.c:802
rq_offline_rt+0x24d/0x260
[ 991.697795] CPU: 0 PID: 49573 Comm: kworker/1:0 Kdump: loaded Not
tainted 6.9.0-rc1+ #4
[ 991.697798] Hardware name: SuperCloud R5210 G12/X12DPi-N6, BIOS 1.1c
08/30/2021
[ 991.697800] Workqueue: events cpuset_hotplug_workfn
[ 991.697803] RIP: 0010:rq_offline_rt+0x24d/0x260
[ 991.697825] Call Trace:
[ 991.697827] <TASK>
[ 991.697830] ? __warn+0x7c/0x130
[ 991.697835] ? rq_offline_rt+0x24d/0x260
[ 991.697837] ? report_bug+0xf8/0x1e0
[ 991.697842] ? handle_bug+0x3f/0x70
[ 991.697858] set_rq_offline.part.125+0x2d/0x70
[ 991.697864] rq_attach_root+0xda/0x110
[ 991.697867] cpu_attach_domain+0x433/0x860
[ 991.697870] ? psi_task_switch+0x11d/0x260
[ 991.697873] ? __kmalloc_node+0x1dc/0x390
[ 991.697877] ? alloc_cpumask_var_node+0x1b/0x30
[ 991.697880] partition_sched_domains_locked+0x2a8/0x3a0
[ 991.697883] ? css_next_child+0x61/0x80
[ 991.697885] rebuild_sched_domains_locked+0x608/0x800
[ 991.697890] ? percpu_rwsem_wait+0x160/0x160
[ 991.697895] rebuild_sched_domains+0x1b/0x30
[ 991.697897] cpuset_hotplug_workfn+0x4b6/0x1160
[ 991.697899] ? balance_push+0x4e/0x120
[ 991.697903] ? finish_task_switch+0x8d/0x2d0
[ 991.697905] ? __switch_to+0x126/0x4f0
[ 991.697909] process_scheduled_works+0xad/0x430
[ 991.697917] worker_thread+0x105/0x270
[ 991.697922] kthread+0xde/0x110
[ 991.697928] ret_from_fork+0x2d/0x50
[ 991.697935] ret_from_fork_asm+0x11/0x20
[ 991.697940] </TASK>
[ 991.697941] ---[ end trace 0000000000000000 ]---
The corresponding code is:
802 WARN_ON_ONCE(want);
In kernels older than linux-6.1 this check is still a BUG_ON rather than
a WARN_ON_ONCE, so it triggers a panic on those versions.
More information:
1. RT_RUNTIME_SHARE is enabled.
2. We continuously create and remove cpu cgroups. We use cgexec to run
short tasks such as "tree" or "ps" in these cgroups, and rt_runtime_us in
each of them is set to a value between 2000 and 6000.
3. CPUs are frequently taken offline and brought back online, which
triggers __disable_runtime().
The warning reproduces easily every time we run these operations after a
reboot.
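For anyone who wants to try this, the three items above could be driven
by something like the sketch below. It is not our actual test script.
The cgroup names, the CPU number, the iteration count and the helper
function names are made up for illustration. It assumes a cgroup v1 cpu
controller with CONFIG_RT_GROUP_SCHED (so that cpu.rt_runtime_us exists),
the libcgroup tools (cgcreate, cgset, cgexec, cgdelete), and debugfs
mounted at /sys/kernel/debug. By default it only prints the commands;
set RUN=1 as root on a test machine to execute them.

```shell
#!/bin/sh
# Reproducer sketch. All names and defaults are assumptions for
# illustration. Dry-run by default: commands are printed, not executed.
RUN="${RUN:-0}"
CPU="${CPU:-1}"      # CPU to offline/online (assumed)
ITER="${ITER:-3}"    # iterations per loop (assumed)

run() {
    # Print the command; execute it only when RUN=1.
    echo "+ $*"
    if [ "$RUN" = "1" ]; then
        eval "$*"
    fi
    return 0
}

enable_share() {
    # Item 1: turn on the RT_RUNTIME_SHARE scheduler feature.
    run "echo RT_RUNTIME_SHARE > /sys/kernel/debug/sched/features"
}

cgroup_cycle() {
    # Item 2: create a cpu cgroup, set rt_runtime_us within 2000..6000,
    # run a short task in it with cgexec, then remove the cgroup.
    i="$1"
    name="rt_test_$i"
    runtime=$((2000 + (i % 5) * 1000))
    run "cgcreate -g cpu:$name"
    run "cgset -r cpu.rt_runtime_us=$runtime $name"
    run "cgexec -g cpu:$name ps > /dev/null"
    run "cgdelete -g cpu:$name"
}

hotplug_cycle() {
    # Item 3: take one CPU offline and bring it back online.
    run "echo 0 > /sys/devices/system/cpu/cpu$CPU/online"
    run "echo 1 > /sys/devices/system/cpu/cpu$CPU/online"
}

main() {
    enable_share
    # Hotplug loop in the background, cgroup churn in the foreground.
    (
        n=0
        while [ "$n" -lt "$ITER" ]; do
            hotplug_cycle
            n=$((n + 1))
        done
    ) &
    n=0
    while [ "$n" -lt "$ITER" ]; do
        cgroup_cycle "$n"
        n=$((n + 1))
    done
    wait
}

main
```

The two loops run concurrently so that cgroup creation and removal
overlap with the hotplug-driven sched domain rebuilds. A much larger
ITER is probably needed for a real run.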
---
Regards
Zhao Wenhui