Date:   Thu, 27 Dec 2018 13:46:17 -0800
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Sargun Dhillon <sargun@...gun.me>
Cc:     Vincent Guittot <vincent.guittot@...aro.org>,
        Xie XiuQi <xiexiuqi@...wei.com>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>, xiezhipeng1@...wei.com,
        huawei.libin@...wei.com,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Dmitry Adamushko <dmitry.adamushko@...il.com>,
        Tejun Heo <tj@...nel.org>
Subject: Re: [PATCH] sched: fix infinity loop in update_blocked_averages

On Thu, Dec 27, 2018 at 1:09 PM Sargun Dhillon <sargun@...gun.me> wrote:
>
> This appears to be broken since October on 4.18.5. We've only noticed
> it recently with a workload which does ridiculously parallel compiles
> in cgroups that are rapidly churned.

Yeah, that's probably unusual enough that people will have missed it.

Because it really looks like the bug has been there since 4.13, unless
I'm mis-reading things. Other things have changed there since, so
maybe I am.

> It's also an awkward bug to catch, because none of the lockup
> detectors were catching it in our environment. The only reason we
> caught it was that it was blocking other cores, and those other cores
> were missing IPIs, resulting in catastrophic failure.

My gut feel is that we just need to revert that commit. It doesn't
revert cleanly, but it doesn't look hard to do manually.

Something like the attached?

But we do need Tejun and PeterZ to take a look, since there might be
something subtle going on.

Everybody is probably still on well-deserved vacations, so it might be
a while. But testing the attached patch is probably a good idea
regardless.

                  Linus

[Attachment: "patch.diff", type "text/x-patch" (2945 bytes)]
