Message-ID: <202002211506.2151CA26@keescook>
Date:   Fri, 21 Feb 2020 15:10:48 -0800
From:   Kees Cook <keescook@...omium.org>
To:     Andy Lutomirski <luto@...capital.net>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Jens Axboe <axboe@...nel.dk>, Jann Horn <jannh@...gle.com>,
        Will Deacon <will@...nel.org>
Subject: Re: [PATCH] mm/tlb: Fix use_mm() vs TLB invalidate

On Fri, Feb 21, 2020 at 11:22:16AM -0800, Andy Lutomirski wrote:
> 
> 
> > On Feb 21, 2020, at 11:19 AM, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> > 
> > On Fri, Feb 21, 2020 at 3:11 AM Peter Zijlstra <peterz@...radead.org> wrote:
> >> 
> >> +       BUG_ON(!(tsk->flags & PF_KTHREAD));
> >> +       BUG_ON(tsk->mm != NULL);
> > 
> > Stop this craziness.
> > 
> > There is absolutely ZERO excuse for this kind of garbage.
> > 
> > Making this a BUG_ON() will just cause all the possible debugging info
> > to be thrown away and lost, and you often have a dead machine.
> > 
> > For absolutely no good reason.
> > 
> > Make it a WARN_ON_ONCE(). If it triggers, everything works the way it
> > always did, but we get notified.
> > 
> > Stop with the stupid crazy BUG_ON() crap already. It is actively _bad_
> > for debugging.
> > 
> >  
> 
> In this particular case, if we actually flub this, we are very likely to cause data corruption — we’re about to do user access with the wrong mm.
> 
> So I suppose we could switch to init_mm and carry on. *Something* will crash, but it probably won’t corrupt data or take down the machine.

Why not just fail after the WARN? I wrote the patch so that the (very
few) callers handle the errors, clean up, and carry on.
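
To make the "warn, fail, let the caller clean up" idea concrete, here is
a minimal userspace sketch. It is not the patch referred to above, and it
is not kernel code: struct task_struct, struct mm_struct and PF_KTHREAD
are reduced stand-ins, and warn_once(), sketch_use_mm() and
sketch_worker() are hypothetical names invented for illustration. It only
shows the control flow: the two checks from the quoted hunk become a
one-time warning plus an error return, and the caller skips the user
access when the attach fails.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-ins for kernel definitions (assumptions, not the real ones). */
#define PF_KTHREAD 0x00200000u

struct mm_struct {
    int id;
};

struct task_struct {
    unsigned int flags;
    struct mm_struct *mm;
};

/*
 * Hypothetical helper in the spirit of WARN_ON_ONCE(): report the first
 * time the condition is true, and always return the condition so the
 * caller can branch on it.
 */
static bool warn_once(bool cond, bool *warned, const char *what)
{
    if (cond && !*warned) {
        *warned = true;
        fprintf(stderr, "WARNING: %s\n", what);
    }
    return cond;
}

/*
 * Hypothetical use_mm()-like function that fails instead of calling
 * BUG(): on a bad caller it warns once, leaves the task untouched and
 * returns -EINVAL.
 */
static int sketch_use_mm(struct task_struct *tsk, struct mm_struct *mm)
{
    static bool warned;
    bool bad = !(tsk->flags & PF_KTHREAD) || tsk->mm != NULL;

    if (warn_once(bad, &warned, "use_mm() on a non-kthread or a task that already has an mm"))
        return -EINVAL;

    tsk->mm = mm;
    return 0;
}

/* Hypothetical caller: handle the error, clean up, and carry on. */
static int sketch_worker(struct task_struct *tsk, struct mm_struct *mm)
{
    int err = sketch_use_mm(tsk, mm);

    if (err)
        return err;     /* no user access is attempted with the wrong mm */

    /* ... user access would happen here ... */

    tsk->mm = NULL;     /* stand-in for the matching detach step */
    return 0;
}
```

With this shape, a misuse produces one warning and an error code that the
caller propagates, rather than a dead machine or a user access against
the wrong mm.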

-- 
Kees Cook