Message-ID: <DM5PR11MB201136DAFFB7005FA4E4DCB0B1700@DM5PR11MB2011.namprd11.prod.outlook.com>
Date: Tue, 19 Jun 2018 19:56:24 +0000
From: Siarhei Liakh <Siarhei.Liakh@...current-rt.com>
To: Andy Lutomirski <luto@...capital.net>
CC: Thomas Gleixner <tglx@...utronix.de>,
Andy Lutomirski <luto@...nel.org>,
LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, Borislav Petkov <bpetkov@...e.de>
Subject: Re: [PATCH] x86: Call fixup_exception() before notify_die() in math_error()
On Tue, 19 Jun 2018, Andy Lutomirski wrote:
> On Jun 19, 2018, at 9:15 AM, Siarhei Liakh <Siarhei.Liakh@...current-rt.com> wrote:
>
> > On Mon, 18 Jun 2018, Andy Lutomirski wrote:
> >
> > > > On Thu, Jun 14, 2018 at 10:10 PM Siarhei Liakh
> > > > <Siarhei.Liakh@...current-rt.com> wrote:
> > > > >
> > > > > fpu__drop() has an explicit fwait which under some conditions can trigger
> > > > > a fixable FPU exception while in kernel. Thus, we should attempt to fixup
> > > > > the exception first, and only call notify_die() if the fixup failed just
> > > > > like in do_general_protection(). The original call sequence incorrectly
> > > > > triggers KDB entry on debug kernels under particular FPU-intensive
> > > > > workloads. This issue had been privately observed, fixed, and tested
> > > > > on 4.9.98, while this patch brings the fix to the upstream.
> > > >
> > > > Reviewed-by: Andy Lutomirski <luto@...nel.org>
> > > >
> > > > With the caveat that you are perpetuating what is arguably a bug in
> > > > some of the other entries: math_error() can now be called with IRQs
> > > > off and return with IRQs on. If we actually start asserting good
> > > > behavior in the entry code, we'll need to fix this.
> > >
> > > Confused. math_error() is still invoked with interrupts off. What's
> > > different now is that notify_die() is called with interrupts conditionally
> > > enabled while upstream it's always called with interrupts disabled.
> >
> > I see that notify_die() is being called either way in upstream (ex:
> > do_general_protection() and do_iret_error() vs. do_bounds(), etc.).
> > Is there some sort of general policy/guide documentation available
> > which outlines the expectations of notify_die(), as well as its notifiers?
>
> I doubt it.
>
> The right fix is to delete notify_die(), not to document it. Kernel debuggers should
> hook die() directly, and other users (if any) should be moved into the error handlers.
Got it. Unfortunately, that is a separate refactoring project which I cannot
undertake at this time. In the meantime, this patch fixes the immediate issue
(KDB being tripped when it shouldn't be), even though it does nothing to
address the deficiencies in the framework itself.
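
For anyone following along, the ordering change discussed in this thread can be modeled in userspace. This is only a sketch under stated assumptions: struct mock_trap and every mock_* function are hypothetical stand-ins, not the kernel's fixup_exception(), notify_die(), or die(), and the model ignores IRQ state and the user-mode path entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one in-kernel FPU trap; not a kernel structure. */
struct mock_trap {
	bool fixable;        /* assumed: a fixup entry covers the faulting instruction */
	int  notifier_calls; /* times the debugger-style notifier chain ran */
	int  die_calls;      /* times the fatal path ran */
};

/* Stand-in for the fixup lookup: true means the trap was recovered. */
bool mock_fixup_exception(const struct mock_trap *t)
{
	assert(t != NULL);
	return t->fixable;
}

/* Stand-in for the notifier chain: records the call, never claims the trap. */
bool mock_notify_die(struct mock_trap *t)
{
	assert(t != NULL);
	t->notifier_calls++;
	return false;
}

/* Stand-in for the fatal path. */
void mock_die(struct mock_trap *t)
{
	assert(t != NULL);
	t->die_calls++;
}

/* Ordering described as the original one: notifiers see every trap. */
void mock_math_error_notify_first(struct mock_trap *t)
{
	if (mock_notify_die(t))
		return;
	if (!mock_fixup_exception(t))
		mock_die(t);
}

/* Ordering proposed by the patch: notifiers see only unfixable traps. */
void mock_math_error_fixup_first(struct mock_trap *t)
{
	if (mock_fixup_exception(t))
		return;
	if (!mock_notify_die(t))
		mock_die(t);
}
```

With fixable set to true, the notify-first variant still records one notifier call (the model's analogue of KDB being entered), while the fixup-first variant records none. With fixable set to false, both variants reach the notifier and then the fatal path.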
Thank you.