[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+55aFycvN=3DvsnRNpZbQ8z3893EK-nJA+V=Fx8o8yaviW7VA@mail.gmail.com>
Date: Tue, 4 Oct 2016 20:29:00 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Paul Gortmaker <paul.gortmaker@...driver.com>
Cc: Johannes Weiner <hannes@...xchg.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Antonio SJ Musumeci <trapexit@...wn.link>,
Miklos Szeredi <miklos@...redi.hu>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
stable <stable@...r.kernel.org>
Subject: Re: BUG_ON() in workingset_node_shadows_dec() triggers
On Tue, Oct 4, 2016 at 7:43 PM, Paul Gortmaker
<paul.gortmaker@...driver.com> wrote:
>
> A couple years ago Ingo had an idea to kill BUG_ON abuse that I made
> a 1st pass at. Back then it seemed nobody cared. Maybe that has since
> changed?
>
> https://lkml.org/lkml/2014/4/30/359
So we actually have the checkpatch warning already:
# avoid BUG() or BUG_ON()
if ($line =~ /\b(?:BUG|BUG_ON)\b/) {
my $msg_type = \&WARN;
$msg_type = \&CHK if ($file);
&{$msg_type}("AVOID_BUG",
"Avoid crashing the kernel - try
using WARN_ON & recovery code rather than BUG() or BUG_ON()\n" .
$herecurr);
}
but it doesn't trigger on VM_BUG_ON().
And I'm not convinced about replacing things with BUG_ON_AND_HALT(),
it simply doesn't fix the existing issue we have: people use BUG_ON(),
and worse, _when_ they use BUG_ON(), they use it instead of error
handling, so the code _around_ the BUG_ON() tends to then very much
depend on what the BUG_ON() checks.
This is actually one way that VM_BUG_ON() is better: it's very much by
design something that can be compiled away, so at least hopefully
nobody thinks of it as a security measure. So we could just say that
we will treat VM_BUG_ON() as a WARN_ON_ONCE(), and just not kill the
machine.
Because I could easily imagine that somebody *does* treat BUG_ON()
that way, thinking "well, if that BUG_ON() triggers at least it won't
then go off the rails later".
In fact, right now we mark BUG() in such a way that gcc can even take
advantage of the "crash the machine" semantics, because we explicitly
mark the BUG() with "unreachable()".
So what I think we should think about is:
- extending the checkpatch warning to VM_BUG_ON too, to discourage new users.
- look at making BUG_ON() simply be less lethal. Remove the
unrechable(), reorganize how the string is stored, and make it act
more like WARN_ON_ONCE() instead (it's the "rewind_stack_do_exit()"
that ends up causing us to try to kill things, we *could* just try to
stop doing that).
- Instead of adding a BUG_ON_AND_HALT(), we could perhaps add a new
FATAL_ERROR() thing that acts like the current BUG_ON, and *not* call
it something similar (we don't want people doing mindless
conversions!). And that's the one that would do the whole
rewind_stack_do_exit() to kill the process.
But apart from the checkpatch thing, it's actually a pretty big change.
Linus
Powered by blists - more mailing lists