linux-kernel - Re: [PATCH] x86: Call fixup_exception() before notify_die() in math_error()


Message-ID: <DM5PR11MB201136DAFFB7005FA4E4DCB0B1700@DM5PR11MB2011.namprd11.prod.outlook.com>
Date:   Tue, 19 Jun 2018 19:56:24 +0000
From:   Siarhei Liakh <Siarhei.Liakh@...current-rt.com>
To:     Andy Lutomirski <luto@...capital.net>
CC:     Thomas Gleixner <tglx@...utronix.de>,
        Andy Lutomirski <luto@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
        Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>, Borislav Petkov <bpetkov@...e.de>
Subject: Re: [PATCH] x86: Call fixup_exception() before notify_die() in
 math_error()

On Tue, 19 Jun 2018, Andy Lutomirski wrote:   

> On Jun 19, 2018, at 9:15 AM, Siarhei Liakh <Siarhei.Liakh@...current-rt.com> wrote:
> 
> > On Mon, 18 Jun 2018, Andy Lutomirski wrote:
> > 
> > > > On Thu, Jun 14, 2018 at 10:10 PM Siarhei Liakh
> > > > <Siarhei.Liakh@...current-rt.com> wrote:
> > > > >
> > > > > fpu__drop() has an explicit fwait which under some conditions can trigger
> > > > > a fixable FPU exception while in kernel. Thus, we should attempt to fixup
> > > > > the exception first, and only call notify_die() if the fixup failed just
> > > > > like in do_general_protection(). The original call sequence incorrectly
> > > > > triggers KDB entry on debug kernels under particular FPU-intensive
> > > > > workloads. This issue had been privately observed, fixed, and tested
> > > > > on 4.9.98, while this patch brings the fix to the upstream.
> > > > 
> > > > Reviewed-by: Andy Lutomirski <luto@...nel.org>
> > > > 
> > > > With the caveat that you are perpetuating what is arguably a bug in
> > > > some of the other entries: math_error() can now be called with IRQs
> > > > off and return with IRQs on.  If we actually start asserting good
> > > > behavior in the entry code, we'll need to fix this.
> > > 
> > > Confused. math_error() is still invoked with interrupts off. What's
> > > different now is that notify_die() is called with interrupts conditionally
> > > enabled while upstream it's always called with interrupts disabled.
> > 
> > I see that notify_die() is being called either way in upstream (e.g.
> > do_general_protection() and do_iret_error() vs. do_bounds(), etc.).
> > Is there some sort of general policy/guide documentation available
> > which outlines the expectations of notify_die(), as well as its notifiers?
> 
> I doubt it.
> 
> The right fix is to delete notify_die(), not to document it. kernel debuggers should
> hook die() directly, and other users (if any) should be moved into the error handlers.

Got it. Unfortunately, this looks like a whole separate refactoring project
which I cannot undertake at this time. In the meantime, this patch offers a fix
for an immediate issue (KDB tripped when it shouldn't be) even if it does
nothing to address the deficiencies in the framework itself.
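
For reference, the ordering change can be illustrated with a small userspace
model. This is only a sketch: struct trap_model and the model_*() helpers are
hypothetical stand-ins for the roles of fixup_exception() and notify_die(),
not the kernel functions themselves, and the user-mode path, IRQ handling and
signal delivery are left out.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's functions. */
enum outcome { FIXED_UP, NOTIFIER_STOPPED, DIED };

struct trap_model {
	bool has_fixup;      /* faulting instruction has a fixup entry */
	bool notifier_stops; /* a notifier (e.g. a debugger) claims the trap */
	int  notify_calls;   /* how many times the notifiers were invoked */
};

/* Models the role of fixup_exception(): true if the trap was fixed up. */
static bool model_fixup_exception(const struct trap_model *t)
{
	return t->has_fixup;
}

/* Models the role of notify_die(): true if a notifier stopped processing. */
static bool model_notify_die(struct trap_model *t)
{
	t->notify_calls++;
	return t->notifier_stops;
}

/* Notify-first ordering: notifiers see every kernel-mode trap, fixable or not. */
enum outcome kernel_trap_notify_first(struct trap_model *t)
{
	if (model_notify_die(t))
		return NOTIFIER_STOPPED;
	if (model_fixup_exception(t))
		return FIXED_UP;
	return DIED;
}

/* Fixup-first ordering: notifiers run only when the fixup fails. */
enum outcome kernel_trap_fixup_first(struct trap_model *t)
{
	if (model_fixup_exception(t))
		return FIXED_UP;
	if (model_notify_die(t))
		return NOTIFIER_STOPPED;
	return DIED;
}
```

In this model, a fixable trap with an attached debugger reaches the notifier
under the notify-first ordering but never does under the fixup-first one,
which corresponds to the spurious KDB entry described above.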

Thank you.