Message-ID: <CAGXu5jJ9sAXauDMeW262qX_42TS2gmJBsR1yq2XDeHzn+54PoA@mail.gmail.com>
Date: Wed, 5 Oct 2016 15:17:02 -0700
From: Kees Cook <keescook@...omium.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Willy Tarreau <w@....eu>,
Paul Gortmaker <paul.gortmaker@...driver.com>,
Johannes Weiner <hannes@...xchg.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Antonio SJ Musumeci <trapexit@...wn.link>,
Miklos Szeredi <miklos@...redi.hu>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
stable <stable@...r.kernel.org>
Subject: Re: BUG_ON() in workingset_node_shadows_dec() triggers

On Wed, Oct 5, 2016 at 2:46 PM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> On Wed, Oct 5, 2016 at 2:14 PM, Kees Cook <keescook@...omium.org> wrote:
>> Now, it can be argued that killing the process part should be
>> configurable and that the code should be written to handle a WARN and
>> clean up and error out nicely. But I still want to retain the "kill
>> the process immediately" behavior in some capacity.
>
> If "some capacity" is "can't do user space accesses", we could easily
> force a SIGKILL of the current process. It won't die immediately in
> the kernel, but it won't be returning to user space either.

With my more paranoid desires, I would prefer to keep "stop kernel
execution with the state set up by this process", not just "make the
process never return to user-space". I would need to meditate on
whether what I really want is just "panic on Oops" or not, though.
Right now, for example, I don't use panic-on-oops when running lkdtm
tests since each test gets (correctly) killed and the Oops can be
examined for the expected failure mode, all without bringing down the
entire system.
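For reference, whether a given machine is running in the panic-on-oops mode mentioned above can be checked from user space. A minimal sketch, assuming the standard Linux sysctl file /proc/sys/kernel/panic_on_oops; the helper name read_panic_on_oops() is made up for illustration:

```c
#include <stdio.h>

/*
 * Hypothetical helper (not an existing API): returns the current
 * panic_on_oops setting (0 or 1), or -1 if the sysctl file cannot be
 * read or parsed, e.g. on a system without /proc.
 */
int read_panic_on_oops(void)
{
    FILE *f = fopen("/proc/sys/kernel/panic_on_oops", "r");
    int value = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

A return value of 0 corresponds to the setup described here, where an Oops in process context kills only the offending task and can be inspected afterwards; 1 means any Oops brings the system down.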
> The problem with the immediate kill is that it can be in interrupt
> context, or just holding arbitrary locks. And it's hard to even tell
> dynamically (sometimes you can see it: with preemption enabled you can
> tell "am I in a non-preempt area", for example, but it ends up
> depending on config options).

Yeah, I've seen some hilarious failure modes while building lkdtm
tests for various kernel self-protections.

> And *if* we make BUG() actually do something sane (non-trapping), we
> can easily make it be generic, not arch-specific. In fact, I'd
> implement it by just adding a "handle_bug()" in kernel/panic.c...

Yeah, I'm not sure what the right next step would be. Do we need a new
set of functions between WARN and BUG? Or maybe extract the
process-killing logic on a per-arch level and make it a specific API
so that it can be explicitly called as part of error-handling? Hmm
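As a rough sketch of what a tier between WARN and BUG might have to decide, here is a userspace model only. Every name in it is made up for illustration, none of it is existing kernel API, and the policy it encodes is just one possible reading of the options discussed in this thread (deferred SIGKILL, immediate kill, panic-on-oops):

```c
#include <stdbool.h>

/* All names below are hypothetical; this is a model, not kernel code. */
enum fail_level {
    FAIL_WARN,      /* report and keep going, like WARN() */
    FAIL_KILL_TASK, /* assumed middle tier: the current task must not continue */
    FAIL_BUG,       /* state is not trusted at all, like BUG() */
};

enum fail_action {
    ACT_CONTINUE,      /* log, let the caller clean up and error out */
    ACT_DEFER_SIGKILL, /* queue SIGKILL: task never returns to user space */
    ACT_KILL_NOW,      /* stop executing with this task's state immediately */
    ACT_PANIC,         /* bring the whole system down */
};

struct fail_ctx {
    bool atomic;        /* interrupt context or holding arbitrary locks */
    bool panic_on_oops; /* administrator asked for a panic on any Oops */
};

enum fail_action decide_action(enum fail_level level, struct fail_ctx ctx)
{
    if (level == FAIL_WARN)
        return ACT_CONTINUE;
    if (ctx.panic_on_oops)
        return ACT_PANIC;
    if (ctx.atomic) {
        /* An in-place kill is unsafe here, so the model either defers
         * the kill or, for the BUG tier, gives up on the system. */
        return level == FAIL_BUG ? ACT_PANIC : ACT_DEFER_SIGKILL;
    }
    return ACT_KILL_NOW;
}
```

The point of the model is only that the "kill the process" step becomes an explicit, callable decision rather than a side effect of the trap, which is the second option asked about above.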
-Kees
--
Kees Cook
Nexus Security