Message-ID: <1980bc4cecf.4f3ffa421612787.2278126391533052891@linux.beauty>
Date: Tue, 15 Jul 2025 09:48:24 +0800
From: Li Chen <me@...ux.beauty>
To: "Ingo Molnar" <mingo@...hat.com>,
"Peter Zijlstra" <peterz@...radead.org>,
"Juri Lelli" <juri.lelli@...hat.com>,
"Vincent Guittot" <vincent.guittot@...aro.org>,
"Dietmar Eggemann" <dietmar.eggemann@....com>,
"Steven Rostedt" <rostedt@...dmis.org>,
"Ben Segall" <bsegall@...gle.com>, "Mel Gorman" <mgorman@...e.de>,
"Valentin Schneider" <vschneid@...hat.com>,
"linux-kernel" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched/debug: Add cond_resched() to sched_debug_show()
gentle ping
---- On Mon, 30 Jun 2025 14:08:26 +0800 Li Chen <me@...ux.beauty> wrote ---
> From: Li Chen <chenl311@...natelecom.cn>
>
> Running stress-ng on machines with many CPUs (e.g., ≥256 cores) can
> spawn a very large number of processes/threads (e.g., over 700,000,
> as seen in the vmcore) and trigger the soft lockup watchdog when reading
> /sys/kernel/debug/sched/debug:
> https://github.com/ColinIanKing/stress-ng/blob/V0.18.10/stress-cpu-sched.c#L860
>
> To improve responsiveness during extensive debug dumps,
> insert cond_resched() into sched_debug_show(). This allows the
> kernel to periodically yield and remain responsive, similar to how
> cond_resched() is used in other iteration-heavy code paths.
>
> Below is the soft lockup call trace:
>
> [ 1996.543070] RIP: 0010:print_cpu+0x2a4/0x770
> [ 1996.543084] Code: f6 ff ff 49 81 ff 58 fc c0 b6 74 69 49 8b 8f 58 03 00 00 48 8b 41 10 48 8d 51 10 48 8d 98 20 f5 ff ff 48 39 c2 74 37 8b 43 14 <39> c5 75 19 49 8b b5 10 0a 00 00 48 89 da 4c 89 e7 e8 d6 f1 ff ff
> [ 1996.543087] RSP: 0018:ffffc900704a7d40 EFLAGS: 00000202
> [ 1996.543090] RAX: 0000000000000038 RBX: ffff88b1b9073900 RCX: ffff88b326b86880
> [ 1996.543093] RDX: ffff88b326b86890 RSI: ffffffffb6527fde RDI: ffff88d579bd7256
> [ 1996.543096] RBP: 0000000000000000 R08: 0000000000000028 R09: ffff88d679bd722d
> [ 1996.543098] R10: ffffffffffffffff R11: 0000000000000000 R12: ffff88d4662cf880
> [ 1996.543099] R13: ffff889045e34d40 R14: ffff88b1b9073900 R15: ffff88b1b9074258
> [ 1996.543101] FS: 00007f0d2a254000(0000) GS:ffff88e04f080000(0000) knlGS:0000000000000000
> [ 1996.543104] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1996.543106] CR2: 0000000000b6d7c0 CR3: 00000033f894a000 CR4: 0000000000350ee0
> [ 1996.543108] Call Trace:
> [ 1996.543115] <TASK>
> [ 1996.543122] sched_debug_show+0x13/0x30
> [ 1996.543127] seq_read_iter+0x122/0x470
> [ 1996.543133] ? restore_fpregs_from_user+0xa9/0x150
> [ 1996.543139] seq_read+0xaa/0xe0
> [ 1996.543148] full_proxy_read+0x59/0x80
> [ 1996.543155] vfs_read+0xa1/0x1c0
> [ 1996.543164] ksys_read+0x63/0xe0
> [ 1996.543168] do_syscall_64+0x55/0x100
> [ 1996.543175] entry_SYSCALL_64_after_hwframe+0x78/0xe2
>
> The full soft lockup message is here:
> https://gist.github.com/FirstLoveLife/73f2185bed83a5faf7f94af8032a527b
>
> Signed-off-by: Li Chen <chenl311@...natelecom.cn>
> ---
> kernel/sched/debug.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> index 9d71baf080751..9dd444c604a8b 100644
> --- a/kernel/sched/debug.c
> +++ b/kernel/sched/debug.c
> @@ -1065,6 +1065,7 @@ static int sched_debug_show(struct seq_file *m, void *v)
>  	else
>  		sched_debug_header(m);
>  
> +	cond_resched();
>  	return 0;
>  }
>
> --
> 2.49.0
>
>
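For readers less familiar with the pattern the commit message refers to, here is a minimal userspace sketch of yielding periodically inside a long iteration. It is an analogy only: `sched_yield()` stands in for the kernel's `cond_resched()` (the two are not equivalent), and `struct entry`, `walk_with_yield()` and the `batch` parameter are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical record standing in for one entry of a long debug dump. */
struct entry {
	int id;
	struct entry *next;
};

/*
 * Walk a long list and yield the CPU every `batch` entries.
 * sched_yield() is only a userspace stand-in for cond_resched().
 * Returns the number of entries visited; *yields gets the yield count.
 */
static size_t walk_with_yield(const struct entry *head, size_t batch,
			      size_t *yields)
{
	size_t visited = 0;

	assert(batch > 0);
	*yields = 0;
	for (const struct entry *e = head; e; e = e->next) {
		visited++;		/* placeholder for "print this entry" */
		if (visited % batch == 0) {
			sched_yield();	/* let other runnable threads run */
			(*yields)++;
		}
	}
	return visited;
}
```

The batch size is a trade-off in this sketch: a small value yields more often and keeps latency low, while a large value reduces the overhead of the yield calls.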
Regards,
Li