Message-ID: <a09c1d4d-1d5b-9092-ae3a-61bc22689dd2@linux.com>
Date: Thu, 30 Sep 2021 18:05:54 +0300
From: Alexander Popov <alex.popov@...ux.com>
To: Petr Mladek <pmladek@...e.com>,
"Paul E. McKenney" <paulmck@...nel.org>
Cc: Jonathan Corbet <corbet@....net>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
Joerg Roedel <jroedel@...e.de>,
Maciej Rozycki <macro@...am.me.uk>,
Muchun Song <songmuchun@...edance.com>,
Viresh Kumar <viresh.kumar@...aro.org>,
Robin Murphy <robin.murphy@....com>,
Randy Dunlap <rdunlap@...radead.org>,
Lu Baolu <baolu.lu@...ux.intel.com>,
Kees Cook <keescook@...omium.org>,
Luis Chamberlain <mcgrof@...nel.org>, Wei Liu <wl@....org>,
John Ogness <john.ogness@...utronix.de>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Alexey Kardashevskiy <aik@...abs.ru>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Jann Horn <jannh@...gle.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Mark Rutland <mark.rutland@....com>,
Andy Lutomirski <luto@...nel.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Steven Rostedt <rostedt@...dmis.org>,
Will Deacon <will.deacon@....com>,
David S Miller <davem@...emloft.net>,
Borislav Petkov <bp@...en8.de>,
kernel-hardening@...ts.openwall.com,
linux-hardening@...r.kernel.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, notify@...nel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Dmitry Vyukov <dvyukov@...gle.com>
Subject: Re: [PATCH] Introduce the pkill_on_warn boot parameter
On 30.09.2021 12:15, Petr Mladek wrote:
> On Wed 2021-09-29 12:49:24, Paul E. McKenney wrote:
>> On Wed, Sep 29, 2021 at 10:01:33PM +0300, Alexander Popov wrote:
>>> On 29.09.2021 21:58, Alexander Popov wrote:
>>>> Currently, the Linux kernel provides two types of reaction to kernel
>>>> warnings:
>>>> 1. Do nothing (by default),
>>>> 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
>>>> so panic_on_warn is usually disabled on production systems.
>
> Honestly, I am not sure if panic_on_warn() or the new pkill_on_warn()
> work as expected. I wonder who uses them in practice and what the
> experience is.
>
> The problem is that many developers do not know about this behavior.
> They use WARN() when they are too lazy to write a more useful message or
> when they want to see all the provided details: task, registers, backtrace.
>
> It is also inconsistent with pr_warn() behavior. Why would a single-line
> warning be innocent while a full-info WARN() causes panic/pkill?
>
> What about pr_err(), pr_crit(), pr_alert(), pr_emerg()? They inform
> about even more serious problems. Why should a warning cause panic/pkill
> while an alert message is just printed?
That's a good question.

I guess various kernel continuous integration (CI) systems have panic_on_warn
enabled.

[Adding Dmitry Vyukov to this discussion]

If we look at the syzbot dashboard [1] with the results of Linux kernel fuzzing,
we see issues that appear as various kernel crashes and warnings.
We don't see anything from pr_err(), pr_crit(), pr_alert(), pr_emerg(). Maybe
these situations are not considered kernel bugs that require fixing.

Anyway, from a security point of view, kernel warning output is interesting
to attackers as an infoleak. The messages printed by pr_err(), pr_crit(),
pr_alert(), pr_emerg() provide less information.

[1]: https://syzkaller.appspot.com/upstream
> It somehow reminds me of the saga with %pK. We were not able to teach
> developers to use it correctly for years and ended up with hashed
> pointers.
>
> Well, this might be different. Developers might learn this the hard
> way from bug reports. But there will be bug reports only when
> someone actually enables this behavior. They will enable it only
> when it works the right way most of the time.
>
>
>>>> From a safety point of view, the Linux kernel misses a middle way of
>>>> handling kernel warnings:
>>>> - The kernel should stop the activity that provokes a warning,
>>>> - But the kernel should avoid complete denial of service.
>>>>
>>>> From a security point of view, kernel warning messages provide a lot of
>>>> useful information for attackers. Many GNU/Linux distributions allow
>>>> unprivileged users to read the kernel log, so attackers use kernel
>>>> warning infoleak in vulnerability exploits. See the examples:
>>>> https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
>>>> https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
>>>>
>>>> Let's introduce the pkill_on_warn boot parameter.
>>>> If this parameter is set, the kernel kills all threads in a process
>>>> that provoked a kernel warning. This behavior is reasonable from a safety
>>>> point of view described above. It is also useful for kernel security
>>>> hardening because the system kills an exploit process that hits a
>>>> kernel warning.
>>>>
>>>> Signed-off-by: Alexander Popov <alex.popov@...ux.com>
>>>
>>> This patch was tested using CONFIG_LKDTM.
>>> The kernel kills a process that performs this:
>>> echo WARNING > /sys/kernel/debug/provoke-crash/DIRECT
>>>
>>> If you are fine with this approach, I will prepare a patch adding the
>>> pkill_on_warn sysctl.
>>
>> I suspect that you need a list of kthreads for which you are better
>> off just invoking panic(). RCU's various kthreads, for but one set
>> of examples.
>
> I wonder if the kernel could survive the killing of any kthread. I have
> never seen code that would check whether a kthread was killed and
> restart it.
The do_group_exit() function calls do_exit() from kernel/exit.c, which is also
called during a kernel oops. do_exit() handles a lot of special cases
depending on the current task_struct. Is relying on that fine?
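
To make the question concrete, here is a minimal user-space sketch in C of the
reaction policy being discussed. It is not the patch: warn_action, task_model
and warn_reaction() are hypothetical names, and escalating to panic for every
kthread is only an assumption that combines the two concerns quoted above.

```c
#include <stdbool.h>

/* Possible reactions to a kernel warning, as discussed in this thread. */
enum warn_action {
	WARN_CONTINUE,     /* default: print the warning and carry on */
	WARN_KILL_PROCESS, /* pkill_on_warn: kill all threads of the process */
	WARN_PANIC,        /* panic_on_warn: stop the whole system */
};

/* Hypothetical stand-in for the few task_struct facts this policy needs. */
struct task_model {
	bool is_kthread;   /* true for kernel threads, e.g. RCU's kthreads */
};

/*
 * Hypothetical policy helper, not taken from the patch.
 * Assumption: with pkill_on_warn set, a warning hit by a kthread escalates
 * to panic, because nothing is known to restart a killed kthread.
 */
static enum warn_action warn_reaction(bool panic_on_warn, bool pkill_on_warn,
				      const struct task_model *task)
{
	if (panic_on_warn)
		return WARN_PANIC;        /* strongest setting wins */
	if (!pkill_on_warn)
		return WARN_CONTINUE;     /* today's default behavior */
	if (task->is_kthread)
		return WARN_PANIC;        /* assumed escalation for kthreads */
	return WARN_KILL_PROCESS;         /* the proposed middle way */
}
```

With only pkill_on_warn set, a userspace task gets WARN_KILL_PROCESS and a
kthread gets WARN_PANIC. Whether that escalation should apply to all kthreads
or only to a list of them is exactly the open question.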
Best regards,
Alexander