linux-kernel - Re: Fwd: WARNING: CPU: 13 PID: 3837105 at kernel/sched/sched.h:1561 __cfsb_csd

Message-ID: <xm26cyz4ibnb.fsf@google.com>
Date:   Wed, 30 Aug 2023 12:16:24 -0700
From:   Benjamin Segall <bsegall@...gle.com>
To:     Bagas Sanjaya <bagasdotme@...il.com>
Cc:     Hao Jia <jiahao.os@...edance.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        Igor Raits <igor.raits@...il.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux Regressions <regressions@...ts.linux.dev>,
        Linux Stable <stable@...r.kernel.org>
Subject: Re: Fwd: WARNING: CPU: 13 PID: 3837105 at kernel/sched/sched.h:1561
 __cfsb_csd_unthrottle+0x149/0x160

Bagas Sanjaya <bagasdotme@...il.com> writes:

> Hi,
>
> I notice a regression report on Bugzilla [1]. Quoting from it:
>
>> Hello, we recently got a few kernel crashes with following backtrace. Happened on 6.4.12 (and 6.4.11 I think) but did not happen (I think) on 6.4.4.
>> 
>> [293790.928007] ------------[ cut here ]------------
>> [293790.929905] rq->clock_update_flags & RQCF_ACT_SKIP
>> [293790.929919] WARNING: CPU: 13 PID: 3837105 at kernel/sched/sched.h:1561 __cfsb_csd_unthrottle+0x149/0x160
>> [293790.933694] Modules linked in: [...]
>> [293790.946262] Unloaded tainted modules: edac_mce_amd(E):1
>> [293790.956625] CPU: 13 PID: 3837105 Comm: QueryWorker-30f Tainted: G        W   E      6.4.12-1.gdc.el9.x86_64 #1
>> [293790.957963] Hardware name: RDO OpenStack Compute/RHEL, BIOS edk2-20230301gitf80f052277c8-2.el9 03/01/2023
>> [293790.959681] RIP: 0010:__cfsb_csd_unthrottle+0x149/0x160
>
> See Bugzilla for the full thread.
>
> Anyway, I'm adding this regression to regzbot:
>
> #regzbot introduced: ebb83d84e49b54 https://bugzilla.kernel.org/show_bug.cgi?id=217843
>
> Thanks.
>
> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=217843

The code in question is literally "rq_lock; update_rq_clock;
rq_clock_start_loop_update (the warning)", which suggests to me that
RQCF_ACT_SKIP is somehow leaking from somewhere else?
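
To make that sequence concrete, here is a minimal userspace sketch of the
flag handling. It is not the kernel code: struct rq_model and every model_*
function are hypothetical stand-ins, and the flag values and the "lock keeps
only the skip bits" behaviour are reproduced from memory of
kernel/sched/sched.h, so treat them as assumptions. The point it illustrates
is that if RQCF_ACT_SKIP is already set when the lock is taken, then
update_rq_clock does nothing and the check in rq_clock_start_loop_update
fires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Flag values assumed to match kernel/sched/sched.h (from memory). */
#define RQCF_REQ_SKIP 0x01
#define RQCF_ACT_SKIP 0x02
#define RQCF_UPDATED  0x04

/* Hypothetical stand-in for the parts of struct rq that matter here. */
struct rq_model {
    unsigned int clock_update_flags;
    bool locked;
    int warnings; /* how many times the modeled warning fired */
};

/* Models rq_lock: taking the lock keeps the skip bits, drops UPDATED. */
static void model_rq_lock(struct rq_model *rq)
{
    assert(!rq->locked);
    rq->locked = true;
    rq->clock_update_flags &= (RQCF_REQ_SKIP | RQCF_ACT_SKIP);
}

static void model_rq_unlock(struct rq_model *rq)
{
    assert(rq->locked);
    rq->locked = false;
}

/* Models update_rq_clock: skipped entirely while ACT_SKIP is set. */
static void model_update_rq_clock(struct rq_model *rq)
{
    assert(rq->locked);
    if (rq->clock_update_flags & RQCF_ACT_SKIP)
        return;
    rq->clock_update_flags |= RQCF_UPDATED;
}

/* Models rq_clock_start_loop_update: warns if ACT_SKIP is already set. */
static void model_rq_clock_start_loop_update(struct rq_model *rq)
{
    assert(rq->locked);
    if (rq->clock_update_flags & RQCF_ACT_SKIP) {
        rq->warnings++;
        fprintf(stderr, "WARNING: rq->clock_update_flags & RQCF_ACT_SKIP\n");
    }
    rq->clock_update_flags |= RQCF_ACT_SKIP;
}

/* Models rq_clock_stop_loop_update: clears ACT_SKIP again. */
static void model_rq_clock_stop_loop_update(struct rq_model *rq)
{
    assert(rq->locked);
    rq->clock_update_flags &= ~RQCF_ACT_SKIP;
}

/*
 * The sequence described in the message: lock, update clock, start loop
 * update. Returns the number of warnings raised by this call.
 */
static int model_unthrottle_path(struct rq_model *rq)
{
    int before = rq->warnings;

    model_rq_lock(rq);
    model_update_rq_clock(rq);
    model_rq_clock_start_loop_update(rq);
    /* ... per-cfs_rq unthrottle loop elided ... */
    model_rq_clock_stop_loop_update(rq);
    model_rq_unlock(rq);

    return rq->warnings - before;
}
```

Calling model_unthrottle_path() on a zero-initialised struct rq_model raises
no warning, while calling it on one whose clock_update_flags already has
RQCF_ACT_SKIP set raises exactly one. In this model the path itself cannot
produce the warning on a clean rq, which is why a flag left behind by some
other code path looks like the likely cause.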