linux-kernel - Re: BUG_ON() in workingset_node_shadows

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CA+55aFycvN=3DvsnRNpZbQ8z3893EK-nJA+V=Fx8o8yaviW7VA@mail.gmail.com>
Date:   Tue, 4 Oct 2016 20:29:00 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Paul Gortmaker <paul.gortmaker@...driver.com>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Antonio SJ Musumeci <trapexit@...wn.link>,
        Miklos Szeredi <miklos@...redi.hu>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        stable <stable@...r.kernel.org>
Subject: Re: BUG_ON() in workingset_node_shadows_dec() triggers

On Tue, Oct 4, 2016 at 7:43 PM, Paul Gortmaker
<paul.gortmaker@...driver.com> wrote:
>
> A couple years ago Ingo had an idea to kill  BUG_ON abuse that I made
> a 1st pass at.  Back then it seemed nobody cared.  Maybe that has since
> changed?
>
> https://lkml.org/lkml/2014/4/30/359

So we actually have the checkpatch warning already:

      # avoid BUG() or BUG_ON()
                if ($line =~ /\b(?:BUG|BUG_ON)\b/) {
                        my $msg_type = \&WARN;
                        $msg_type = \&CHK if ($file);
                        &{$msg_type}("AVOID_BUG",
                                     "Avoid crashing the kernel - try
using WARN_ON & recovery code rather than BUG() or BUG_ON()\n" .
$herecurr);
                }

but it doesn't trigger on VM_BUG_ON().

And I'm not convinced about replacing things with BUG_ON_AND_HALT(),
it simply doesn't fix the existing issue we have: people use BUG_ON(),
and worse, _when_ they use BUG_ON(), they use it instead of error
handling, so the code _around_ the BUG_ON() tends to then very much
depend on what the BUG_ON() checks.

This is actually one way that VM_BUG_ON() is better: it's very much by
design something that can be compiled away, so at least hopefully
nobody thinks of it as a security measure. So we could just say that
we will treat VM_BUG_ON() as a WARN_ON_ONCE(), and just not kill the
machine.

Because I could easily imagine that somebody *does* treat BUG_ON()
that way, thinking "well, if that BUG_ON() triggers at least it won't
then go off the rails later".

In fact, right now we mark BUG() in such a way that gcc can even take
advantage of the "crash the machine" semantics, because we explicitly
mark the BUG() with "unreachable()".

So what I think we should think about is:

 - extending the checkpatch warning to VM_BUG_ON too, to discourage new users.

 - look at making BUG_ON() simply be less lethal. Remove the
unrechable(), reorganize how the string is stored, and make it act
more like WARN_ON_ONCE() instead (it's the "rewind_stack_do_exit()"
that ends up causing us to try to kill things, we *could* just try to
stop doing that).

 - Instead of adding a BUG_ON_AND_HALT(), we could perhaps add a new
FATAL_ERROR() thing that acts like the current BUG_ON, and *not* call
it something similar (we don't want people doing mindless
conversions!). And that's the one that would do the whole
rewind_stack_do_exit() to kill the process.

But apart from the checkpatch thing, it's actually a pretty big change.

               Linus