linux-hardening - Re: [PATCH] Introduce the pkill_on

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c48643fa-213d-7391-9f5d-c75efe709c3f@linux.com>
Date:   Wed, 6 Oct 2021 17:56:55 +0300
From:   Alexander Popov <alex.popov@...ux.com>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Petr Mladek <pmladek@...e.com>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Jonathan Corbet <corbet@....net>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Joerg Roedel <jroedel@...e.de>,
        Maciej Rozycki <macro@...am.me.uk>,
        Muchun Song <songmuchun@...edance.com>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Robin Murphy <robin.murphy@....com>,
        Randy Dunlap <rdunlap@...radead.org>,
        Lu Baolu <baolu.lu@...ux.intel.com>,
        Kees Cook <keescook@...omium.org>,
        Luis Chamberlain <mcgrof@...nel.org>, Wei Liu <wl@....org>,
        John Ogness <john.ogness@...utronix.de>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Alexey Kardashevskiy <aik@...abs.ru>,
        Christophe Leroy <christophe.leroy@...roup.eu>,
        Jann Horn <jannh@...gle.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Mark Rutland <mark.rutland@....com>,
        Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Will Deacon <will.deacon@....com>,
        David S Miller <davem@...emloft.net>,
        Borislav Petkov <bp@...en8.de>,
        Kernel Hardening <kernel-hardening@...ts.openwall.com>,
        linux-hardening@...r.kernel.org,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        notify@...nel.org
Subject: Re: [PATCH] Introduce the pkill_on_warn boot parameter

On 05.10.2021 22:48, Eric W. Biederman wrote:
> Alexander Popov <alex.popov@...ux.com> writes:
> 
>> On 02.10.2021 19:52, Linus Torvalds wrote:
>>> On Sat, Oct 2, 2021 at 4:41 AM Alexander Popov <alex.popov@...ux.com> wrote:
>>>>
>>>> And what do you think about the proposed pkill_on_warn?
>>>
>>> Honestly, I don't see the point.
>>>
>>> If you can reliably trigger the WARN_ON some way, you can probably
>>> cause more problems by fooling some other process to trigger it.
>>>
>>> And if it's unintentional, then what does the signal help?
>>>
>>> So rather than a "rationale" that makes little sense, I'd like to hear
>>> of an actual _use_ case. That's different. That's somebody actually
>>> _using_ that pkill to good effect for some particular load.
>>
>> I was thinking about a use case for you and got an insight.
>>
>> Bugs usually don't come alone. Killing the process that got WARN_ON() prevents
>> possible bad effects **after** the warning. For example, in my exploit for
>> CVE-2019-18683, the kernel warning happens **before** the memory corruption
>> (use-after-free in the V4L2 subsystem).
>> https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
>>
>> So pkill_on_warn allows the kernel to stop the process when the first signs of
>> wrong behavior are detected. In other words, proceeding with the code execution
>> from the wrong state can bring more disasters later.
>>
>>> That said, I don't much care in the end. But it sounds like a
>>> pointless option to just introduce yet another behavior to something
>>> that should never happen anyway, and where the actual
>>> honest-to-goodness reason for WARN_ON() existing is already being
>>> fulfilled (ie syzbot has been very effective at flushing things like
>>> that out).
>>
>> Yes, we slowly get rid of kernel warnings.
>> However, the syzbot dashboard still shows a lot of them.
>> Even my small syzkaller setup finds plenty of new warnings.
>> I believe fixing all of them will take some time.
>> And during that time, pkill_on_warn may be a better reaction to WARN_ON() than
>> ignoring and proceeding with the execution.
>>
>> Is that reasonable?
> 
> I won't comment on the sanity of the feature but I will say that calling
> it oops_on_warn (rather than pkill_on_warn), and using the usual oops
> facilities rather than rolling oops by hand sounds like a better
> implementation.
> 
> Especially as calling do_group_exit(SIGKILL) from a random location is
> not a clean way to kill a process.  Strictly speaking it is not even
> killing the process.
> 
> Partly this is just me seeing the introduction of a
> do_group_exit(SIGKILL) call and not likely the maintenance that will be
> needed.  I am still sorting out the problems with other randomly placed
> calls to do_group_exit(SIGKILL) and interactions with ptrace and
> PTRACE_EVENT_EXIT in particular.
> 
> Which is a long winded way of saying if I can predictably trigger a
> warning that calls do_group_exit(SIGKILL), on some architectures I can
> use ptrace and  can convert that warning into a way to manipulate the
> kernel stack to have the contents of my choice.
> 
> If anyone goes forward with this please use the existing oops
> infrastructure so the ptrace interactions and anything else that comes
> up only needs to be fixed once.

Eric, thanks a lot.

I will learn the oops infrastructure deeper.
I will do more experiments and come with version 2.

Currently, I think I will save the pkill_on_warn option name because I want to
avoid kernel crashes.

Thanks to everyone who gave feedback on this patch!

Best regards,
Alexander