lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 11 Apr 2022 09:18:19 +0200
From:   Holger Hoffstätte <holger@...lied-asynchrony.com>
To:     Qais Yousef <qais.yousef@....com>,
        Greg KH <gregkh@...uxfoundation.org>
Cc:     linux-kernel@...r.kernel.org, linux-tip-commits@...r.kernel.org,
        Abhijeet Dharmapurikar <adharmap@...cinc.com>,
        Valentin Schneider <valentin.schneider@....com>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        "Steven Rostedt (Google)" <rostedt@...dmis.org>, x86@...nel.org,
        stable@...r.kernel.org
Subject: Re: [tip: sched/core] sched/tracing: Don't re-read p->state when
 emitting sched_switch event

On 2022-04-11 01:22, Holger Hoffstätte wrote:
> On 2022-04-11 00:06, Qais Yousef wrote:
>> On 04/10/22 00:38, Qais Yousef wrote:
>>> On 03/08/22 18:51, Qais Yousef wrote:
>>>> On 03/08/22 19:10, Greg KH wrote:
>>>>> On Tue, Mar 08, 2022 at 06:02:40PM +0000, Qais Yousef wrote:
>>>>>> +CC stable
>>>>>>
>>>>>> On 03/01/22 15:24, tip-bot2 for Valentin Schneider wrote:
>>>>>>> The following commit has been merged into the sched/core branch of tip:
>>>>>>>
>>>>>>> Commit-ID:     fa2c3254d7cfff5f7a916ab928a562d1165f17bb
>>>>>>> Gitweb:        https://git.kernel.org/tip/fa2c3254d7cfff5f7a916ab928a562d1165f17bb
>>>>>>> Author:        Valentin Schneider <valentin.schneider@....com>
>>>>>>> AuthorDate:    Thu, 20 Jan 2022 16:25:19
>>>>>>> Committer:     Peter Zijlstra <peterz@...radead.org>
>>>>>>> CommitterDate: Tue, 01 Mar 2022 16:18:39 +01:00
>>>>>>>
>>>>>>> sched/tracing: Don't re-read p->state when emitting sched_switch event
>>>>>>>
>>>>>>> As of commit
>>>>>>>
>>>>>>>    c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu")
>>>>>>>
>>>>>>> the following sequence becomes possible:
>>>>>>>
>>>>>>>               p->__state = TASK_INTERRUPTIBLE;
>>>>>>>               __schedule()
>>>>>>>             deactivate_task(p);
>>>>>>>    ttwu()
>>>>>>>      READ !p->on_rq
>>>>>>>      p->__state=TASK_WAKING
>>>>>>>             trace_sched_switch()
>>>>>>>               __trace_sched_switch_state()
>>>>>>>                 task_state_index()
>>>>>>>                   return 0;
>>>>>>>
>>>>>>> TASK_WAKING isn't in TASK_REPORT, so the task appears as TASK_RUNNING in
>>>>>>> the trace event.
>>>>>>>
>>>>>>> Prevent this by pushing the value read from __schedule() down the trace
>>>>>>> event.
>>>>>>>
>>>>>>> Reported-by: Abhijeet Dharmapurikar <adharmap@...cinc.com>
>>>>>>> Signed-off-by: Valentin Schneider <valentin.schneider@....com>
>>>>>>> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
>>>>>>> Reviewed-by: Steven Rostedt (Google) <rostedt@...dmis.org>
>>>>>>> Link: https://lore.kernel.org/r/20220120162520.570782-2-valentin.schneider@arm.com
>>>>>>
>>>>>> Any objection to picking this for stable? I'm interested in this one for some
>>>>>> Android users but prefer if it can be taken by stable rather than backport it
>>>>>> individually.
>>>>>>
>>>>>> I think it makes sense to pick the next one in the series too.
>>>>>
>>>>> What commit does this fix in Linus's tree?
>>>>
>>>> It should be this one: c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu")
>>>
>>> Should this be okay to be picked up by stable now? I can see AUTOSEL has picked
>>> it up for v5.15+, but it impacts v5.10 too.
>>
>> commit: fa2c3254d7cfff5f7a916ab928a562d1165f17bb
>> subject: sched/tracing: Don't re-read p->state when emitting sched_switch event
>>
>> This patch has an impact on Android 5.10 users who experience tooling breakage.
>> Is it possible to include in 5.10 LTS please?
>>
>> It was already picked up for 5.15+ by AUTOSEL and only 5.10 is missing.
>>
> 
> https://lore.kernel.org/stable/Yk2PQzynOVOzJdPo@kroah.com/
> 
> However, since then further investigation (still in progress) has shown that this
> may have been the fault of the tool in question, so if you can verify that tracing
> sched still works for you with this patch in 5.15.x then by all means
> let's merge it.

So it turns out the lockup is indeed the fault of the tool, which contains multiple
kernel-version dependent tracepoint definitions and now fails with this
patch.

Greg, please re-enqueue this patch where necessary (5.10, 5.15+)
  
-h

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ