Message-ID: <CA+G9fYv-h1ONWq232S0BxpdWy52ysPs1VPzKV6ZypSV6B_VSpQ@mail.gmail.com>
Date: Wed, 17 Dec 2025 00:41:39 +0530
From: Naresh Kamboju <naresh.kamboju@...aro.org>
To: Mark Rutland <mark.rutland@....com>
Cc: sched-ext@...ts.linux.dev, open list <linux-kernel@...r.kernel.org>, 
	lkft-triage@...ts.linaro.org, Peter Zijlstra <peterz@...radead.org>, 
	Ingo Molnar <mingo@...nel.org>, Vincent Guittot <vincent.guittot@...aro.org>, 
	Juri Lelli <juri.lelli@...hat.com>, Dietmar Eggemann <dietmar.eggemann@....com>, 
	Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, 
	Valentin Schneider <vschneid@...hat.com>, Tejun Heo <tj@...nel.org>, void@...ifault.com, 
	arighi@...dia.com, changwoo@...lia.com
Subject: Re: Boot regression: arm64: WARNING: kernel/sched/core.c:10851 at sched_change_end

Hi Mark,

On Tue, 16 Dec 2025 at 22:53, Mark Rutland <mark.rutland@....com> wrote:
>
> On Tue, Dec 16, 2025 at 05:09:52PM +0530, Naresh Kamboju wrote:
> > The following boot warning is noticed on qemu-arm64 devices booting
> > Linux next-20251215 on wards.
>
> I didn't realise LKFT was operating from a hospital!

:) No hospital ward involved :)  “onwards” it is. Thanks for spotting that.

Thank you for taking the time to review the report and for the detailed
feedback. I appreciate you pointing out the issues with ordering,
formatting, and accuracy. This is very helpful.

> > Regression Analysis:
> > - New regression? yes
> > - Reproducibility? yes
> >
> > First seen on next-20251215
> > Bad:  next-20251215 and next-20251216
> > Good: next-20251212
> >
> > Boot regression: arm64: WARNING: kernel/sched/core.c:10851 at sched_change_end
>
> This email is really painful to read, because the information is
> out-of-order and has random words added to obscure relevant information.

This one’s on me. I pasted the kernel logs into Gmail instead of sending
them with git send-email or mutt. I’ll use git send-email or mutt for
regression reports going forward.

> The warning you quote should be the *FIRST* line after you mention "The
> following boot warning".
>
> That should be quoted *exactly* as the kernel logged it, without being
> prefixed by "Boot regression: arm64:", which the kernel didn't log, and
> which is redundant given the title and surrounding context.

We added the prefixes for internal statistics tracking and report
classification. The ones we currently use are:

Build regression:
Boot regression:
Test regression:

I acknowledge that they reduced clarity in this case. Do you have any
suggestions for a regression classification identifier we could use in
reports instead?

>
> It'd be *significantly* clearer to have:
>
> | I'm seeing the following warning consistently on qemu-arm64 when
> | booting next-20251215 and later:
> |
> |   WARNING: kernel/sched/core.c:10851 at sched_change_end
> |
> | Note: full splat at the end of this mail.
> |
> | First seen on next-20251215
> | Bad:  next-20251215 and next-20251216
> | Good: next-20251212
> |
> | This looks to be a regression.

Noted.
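
To keep future reports consistent with this layout, the summary could be
generated rather than typed by hand. Below is a minimal sketch, assuming a
hypothetical helper named format_regression_report(); it is not existing
LKFT tooling, and the parameter names are illustrative only. It puts the
warning first, exactly as the kernel logged it, with no classification
prefix.

```python
def format_regression_report(warning, platform, first_bad, bad, good):
    """Build the report summary with the kernel's own warning line up front.

    Hypothetical helper: 'warning' must be passed exactly as logged by the
    kernel; no "Boot regression:" style prefix is added here.
    """
    lines = [
        f"I'm seeing the following warning consistently on {platform} when",
        f"booting {first_bad} and later:",
        "",
        f"  {warning}",
        "",
        "Note: full splat at the end of this mail.",
        "",
        f"First seen on {first_bad}",
        f"Bad:  {' and '.join(bad)}",
        f"Good: {good}",
        "",
        "This looks to be a regression.",
    ]
    return "\n".join(lines)


report = format_regression_report(
    warning="WARNING: kernel/sched/core.c:10851 at sched_change_end",
    platform="qemu-arm64",
    first_bad="next-20251215",
    bad=["next-20251215", "next-20251216"],
    good="next-20251212",
)
print(report)
```

Any classification we need for statistics would then live outside this
block, for example in the subject line, so the quoted warning stays
untouched.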

>
> ... where someone can see all the relevant details at a glance.
>
> > Reported-by: Linux Kernel Functional Testing <lkft@...aro.org>

<trim>

> This is all formatted illegibly since it was line-wrapped, and you
> didn't mention why you're dumping this. AFAICT the only relevant bit is
> that the warning is from:
>
>         WARN_ON_ONCE(sched_class_above(ctx->class, p->sched_class) &&
>                      !test_tsk_need_resched(p));
>
> ... which Peter added in commit:
>
>   47efe2ddccb1f ("sched/core: Add assertions to QUEUE_CLASS")
>
> Have you tried looking around that commit, or bisecting?

My point was simply to share the git blame details for the recent change.

>
> [...]
>
> Note: I've fixed the line-wrapping for the below. Please fix that in
> future.

I’ll make sure to follow your suggested regression reporting format
going forward.

>
> > [   14.696414] ------------[ cut here ]------------
> > [   14.696418] WARNING: kernel/sched/core.c:10851 at sched_change_end+0x168/0x188, CPU#12: ktimers/12/117
> > [   14.729321] Modules linked in: cppc_cpufreq(+) arm_dsu_pmu(+) fuse drm backlight
> > [   14.736718] CPU: 12 UID: 0 PID: 117 Comm: ktimers/12 Not tainted 6.19.0-rc1-next-20251216 #1 PREEMPT_RT
> > [   14.746190] Hardware name: WIWYNN Mt.Jade Server System B81.030Z1.0010/Mt.Jade Motherboard, BIOS 2.10.20250506-1P (SCP: 2.10.20250506) 2025/05/06
>
> This doesn't look like a "qemu-arm64 device" to me. Are you *sure* this
> wasn't bare-metal on a "WIWYNN Mt.Jade Server System"?
>
> If not, why is QEMU passing that gunk to the guest!?

Apologies for the confusion about the platform. The warning reproduces on
both qemu-arm64 and Mt. Jade; the log I posted is from the Mt. Jade system.

>
> Mark.

- Naresh
