lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 2 Oct 2021 14:41:34 +0300
From:   Alexander Popov <>
To:     Linus Torvalds <>,
        Petr Mladek <>
Cc:     "Paul E. McKenney" <>,
        Jonathan Corbet <>,
        Andrew Morton <>,
        Thomas Gleixner <>,
        Peter Zijlstra <>,
        Joerg Roedel <>,
        Maciej Rozycki <>,
        Muchun Song <>,
        Viresh Kumar <>,
        Robin Murphy <>,
        Randy Dunlap <>,
        Lu Baolu <>,
        Kees Cook <>,
        Luis Chamberlain <>, Wei Liu <>,
        John Ogness <>,
        Andy Shevchenko <>,
        Alexey Kardashevskiy <>,
        Christophe Leroy <>,
        Jann Horn <>,
        Greg Kroah-Hartman <>,
        Mark Rutland <>,
        Andy Lutomirski <>,
        Dave Hansen <>,
        Steven Rostedt <>,
        Will Deacon <>,
        David S Miller <>,
        Borislav Petkov <>,
        Kernel Hardening <>,,
        "open list:DOCUMENTATION" <>,
        Linux Kernel Mailing List <>,
Subject: Re: [PATCH] Introduce the pkill_on_warn boot parameter

On 01.10.2021 22:59, Linus Torvalds wrote:
> On Thu, Sep 30, 2021 at 2:15 AM Petr Mladek <> wrote:
>> Honestly, I am not sure if panic_on_warn() or the new pkill_on_warn()
>> work as expected. I wonder who uses it in practice and what is
>> the experience.
> Afaik, there are only two valid uses for panic-on-warn:
>  (a) test boxes (particularly VM's) that are literally running things
> like syzbot and want to report any kernel warnings
>  (b) the "interchangeable production machinery" fail-fast kind of situation
> So in that (a) case, it's literally that you consider a warning to be
> a failure case, and just want to stop. Very useful as a way to get
> notified by syzbot that "oh, that assert can actually trigger".
> And the (b) case is more of a "we have 150 million machines, we expect
> about a thousand of them to fail for any random reason any day
> _anyway_ - perhaps simply due to hardware failure, and we'd rather
> take a machine down quickly and then perhaps look at why only much
> later when we have some pattern to the failures".
> You shouldn't expect panic-on-warn to ever be the case for any actual
> production machine that _matters_. If it is, that production
> maintainer only has themselves to blame if they set that flag.
> But yes, the expectation is that warnings are for "this can't happen,
> but if it does, it's not necessarily fatal, I want to know about it so
> that I can think about it".
> So it might be a case that you don't handle, but that isn't
> necessarily _wrong_ to not handle. You are ok returning an error like
> -ENOSYS for that case, for example, but at the same time you are "If
> somebody uses this, we should perhaps react to it".
> In many cases, a "pr_warn()" is much better. But if you are unsure
> just _how_ the situation can happen, and want a call trace and
> information about what process did it, and it really is a "this
> shouldn't ever happen" situation, a WARN_ON() or a WARN_ON_ONCE() is
> certainly not wrong.
> So think of WARN_ON() as basically an assert, but an assert with the
> intention to be able to continue so that the assert can actually be
> reported. BUG_ON() and friends easily result in a machine that is
> dead. That's unacceptable.
> And think of "panic-on-warn" as people who can deal with their own
> problems. It's fundamentally not your issue.  They took that choice,
> it's their problem, and the security arguments are pure BS - because
> WARN_ON() just shouldn't be something you can trigger anyway.

Thanks, Linus.
And what do you think about the proposed pkill_on_warn?

Let me quote the rationale behind it.

Currently, the Linux kernel provides two types of reaction to kernel warnings:
 1. Do nothing (by default),
 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
    so panic_on_warn is usually disabled on production systems.

>From a safety point of view, the Linux kernel misses a middle way of handling
kernel warnings:
 - The kernel should stop the activity that provokes a warning,
 - But the kernel should avoid complete denial of service.

>From a security point of view, kernel warning messages provide a lot of useful
information for attackers. Many GNU/Linux distributions allow unprivileged users
to read the kernel log (for various reasons), so attackers use kernel warning
infoleak in vulnerability exploits. See the examples:

Let's introduce the pkill_on_warn parameter.
If this parameter is set, the kernel kills all threads in a process that
provoked a kernel warning. This behavior is reasonable from a safety point of
view described above. It is also useful for kernel security hardening because
the system kills an exploit process that hits a kernel warning.

Linus, how do you see the proper way of handling WARN_ON() in kthreads if
pkill_on_warn is enabled?


Best regards,

Powered by blists - more mailing lists