Message-ID: <7e4ef8c8-2def-5af9-f80e-b276fea8696a@i-love.sakura.ne.jp>
Date:   Fri, 3 May 2019 09:47:03 +0900
From:   Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
To:     Daniel Vetter <daniel.vetter@...ll.ch>,
        Intel Graphics Development <intel-gfx@...ts.freedesktop.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Daniel Vetter <daniel.vetter@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        "Paul E. McKenney" <paulmck@...ux.ibm.com>,
        Valdis Kletnieks <valdis.kletnieks@...edu>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        "Liu, Chuansheng" <chuansheng.liu@...el.com>
Subject: Re: [PATCH] RFC: hung_task: taint kernel

On 2019/05/03 5:46, Daniel Vetter wrote:
> There's the hung_task_panic sysctl, but that's a bit an extreme measure.
> As a fallback taint at least the machine.
> 
> Our CI uses this to decide when a reboot is necessary, plus to figure
> out whether the kernel is still happy.

Why can't your CI watch for the "blocked for more than" message instead of
setting the taint flag? How does your CI decide that a reboot is necessary?
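
As a rough illustration of what watching for that message could look like,
here is a minimal sketch that scans kernel log lines for the hung task
report. The pattern assumes the usual "INFO: task <comm>:<pid> blocked for
more than <N> seconds." wording; the function names find_hung_tasks() and
reboot_needed() are hypothetical, and treating any report as a reason to
reboot is only an assumed policy, not something your CI is known to do.

```python
import re

# Assumed log format, e.g.:
#   "INFO: task kworker/0:1:42 blocked for more than 120 seconds."
# The task name (comm) may itself contain ':' so the greedy match is
# anchored on the trailing ":<pid> blocked for more than".
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_lines):
    """Return a (comm, pid, seconds) tuple for each hung task report."""
    reports = []
    for line in log_lines:
        match = HUNG_TASK_RE.search(line)
        if match:
            reports.append(
                (match.group("comm"),
                 int(match.group("pid")),
                 int(match.group("secs")))
            )
    return reports


def reboot_needed(log_lines):
    """Hypothetical policy: any hung task report means reboot."""
    return bool(find_hung_tasks(log_lines))
```

The lines could come from dmesg output or a serial console capture; a real
setup would probably also want to look at the reported timeout before
deciding anything.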

There is no need to set the taint flag when some task was merely blocked
for a while. It might be due to memory pressure, to a very short timeout
setting (e.g. a few seconds), or to busy CPUs doing something else...