linux-kernel - Re: [PATCH v2 0/2] Introduce the pkill_on

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <265cd2fd-9a9b-625d-a530-299bb7433edf@linux.com>
Date:   Tue, 16 Nov 2021 11:34:17 +0300
From:   Alexander Popov <alex.popov@...ux.com>
To:     Christophe Leroy <christophe.leroy@...roup.eu>,
        Steven Rostedt <rostedt@...dmis.org>,
        Lukas Bulwahn <lukas.bulwahn@...il.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Jonathan Corbet <corbet@....net>,
        Paul McKenney <paulmck@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Joerg Roedel <jroedel@...e.de>,
        Maciej Rozycki <macro@...am.me.uk>,
        Muchun Song <songmuchun@...edance.com>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Robin Murphy <robin.murphy@....com>,
        Randy Dunlap <rdunlap@...radead.org>,
        Lu Baolu <baolu.lu@...ux.intel.com>,
        Petr Mladek <pmladek@...e.com>,
        Kees Cook <keescook@...omium.org>,
        Luis Chamberlain <mcgrof@...nel.org>, Wei Liu <wl@....org>,
        John Ogness <john.ogness@...utronix.de>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Alexey Kardashevskiy <aik@...abs.ru>,
        Jann Horn <jannh@...gle.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Mark Rutland <mark.rutland@....com>,
        Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Will Deacon <will@...nel.org>,
        Ard Biesheuvel <ardb@...nel.org>,
        Laura Abbott <labbott@...nel.org>,
        David S Miller <davem@...emloft.net>,
        Borislav Petkov <bp@...en8.de>, Arnd Bergmann <arnd@...db.de>,
        Andrew Scull <ascull@...gle.com>,
        Marc Zyngier <maz@...nel.org>, Jessica Yu <jeyu@...nel.org>,
        Iurii Zaikin <yzaikin@...gle.com>,
        Rasmus Villemoes <linux@...musvillemoes.dk>,
        Wang Qing <wangqing@...o.com>, Mel Gorman <mgorman@...e.de>,
        Mauro Carvalho Chehab <mchehab+huawei@...nel.org>,
        Andrew Klychkov <andrew.a.klychkov@...il.com>,
        Mathieu Chouquet-Stringer <me@...hieu.digital>,
        Daniel Borkmann <daniel@...earbox.net>,
        Stephen Kitt <steve@....org>, Stephen Boyd <sboyd@...nel.org>,
        Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
        Mike Rapoport <rppt@...nel.org>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        Kernel Hardening <kernel-hardening@...ts.openwall.com>,
        linux-hardening@...r.kernel.org,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        linux-arch <linux-arch@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>, notify@...nel.org,
        main@...ts.elisa.tech, safety-architecture@...ts.elisa.tech,
        devel@...ts.elisa.tech, Shuah Khan <shuah@...nel.org>,
        Gabriele Paoloni <gpaoloni@...hat.com>,
        Robert Krutsch <krutsch@...il.com>
Subject: Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter

On 16.11.2021 09:37, Christophe Leroy wrote:
> Le 15/11/2021 à 17:06, Steven Rostedt a écrit :
>> On Mon, 15 Nov 2021 14:59:57 +0100
>> Lukas Bulwahn <lukas.bulwahn@...il.com> wrote:
>>
>>> 1. Allow a reasonably configured kernel to boot and run with
>>> panic_on_warn set. Warnings should only be raised when something is
>>> not configured as the developers expect it or the kernel is put into a
>>> state that generally is _unexpected_ and has been exposed little to
>>> the critical thought of the developer, to testing efforts and use in
>>> other systems in the wild. Warnings should not be used for something
>>> informative, which still allows the kernel to continue running in a
>>> proper way in a generally expected environment. Up to my knowledge,
>>> there are some kernels in production that run with panic_on_warn; so,
>>> IMHO, this requirement is generally accepted (we might of course
>>
>> To me, WARN*() is the same as BUG*(). If it gets hit, it's a bug in the
>> kernel and needs to be fixed. I have several WARN*() calls in my code, and
>> it's all because the algorithms used is expected to prevent the condition
>> in the warning from happening. If the warning triggers, it means either that
>> the algorithm is wrong or my assumption about the algorithm is wrong. In
>> either case, the kernel needs to be updated. All my tests fail if a WARN*()
>> gets hit (anywhere in the kernel, not just my own).
>>
>> After reading all the replies and thinking about this more, I find the
>> pkill_on_warning actually worse than not doing anything. If you are
>> concerned about exploits from warnings, the only real solution is a
>> panic_on_warning. Yes, it brings down the system, but really, it has to be
>> brought down anyway, because it is in need of a kernel update.
>>
> 
> We also have LIVEPATCH to avoid bringing down the system for a kernel
> update, don't we ? So I wouldn't expect bringing down a vital system
> just for a WARN.

Hello Christophe,

I would say that different systems have different requirements.
Not every Linux-based system needs live patching (it also has own limitations).

That's why I proposed a sysctl and didn't change the default kernel behavior.

> As far as I understand from
> https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on,
> WARN() and WARN_ON() are meant to deal with those situations as
> gracefull as possible, allowing the system to continue running the best
> it can until a human controled action is taken.

I can't agree here. There is a very strong push against adding BUG*() to the 
kernel source code. So there are a lot of cases when WARN*() is used for severe 
problems because kernel developers just don't have other options.

Currently, it looks like there is no consistent error handling policy in the kernel.

> So I'd expect the WARN/WARN_ON to be handled and I agree that that
> pkill_on_warning seems dangerous and unrelevant, probably more dangerous
> than doing nothing, especially as the WARN may trigger for a reason
> which has nothing to do with the running thread.

Sorry, I see a contradiction.
If killing a process hitting a kernel warning is "dangerous and unrelevant",
why killing a process on a kernel oops is fine? That's strange.

Linus calls that behavior "fairly benign" here: 
http://lkml.iu.edu/hypermail/linux/kernel/1610.0/01217.html

Best regards,
Alexander