[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150306075833.GA623@gmail.com>
Date: Fri, 6 Mar 2015 08:58:34 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Dave Hansen <dave.hansen@...el.com>, Borislav Petkov <bp@...e.de>,
Andy Lutomirski <luto@...capital.net>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Pekka Riikonen <priikone@....fi>,
Rik van Riel <riel@...hat.com>,
Suresh Siddha <sbsiddha@...il.com>,
LKML <linux-kernel@...r.kernel.org>,
"Yu, Fenghua" <fenghua.yu@...el.com>,
Quentin Casasnovas <quentin.casasnovas@...cle.com>
Subject: Re: [PATCH 1/1] x86/fpu: math_state_restore() should not blindly
disable irqs
* Oleg Nesterov <oleg@...hat.com> wrote:
> On 03/05, Ingo Molnar wrote:
> >
> > * Oleg Nesterov <oleg@...hat.com> wrote:
> >
> > > --- a/arch/x86/kernel/traps.c
> > > +++ b/arch/x86/kernel/traps.c
> > > @@ -774,7 +774,10 @@ void math_state_restore(void)
> > > struct task_struct *tsk = current;
> > >
> > > if (!tsk_used_math(tsk)) {
> > > - local_irq_enable();
> > > + bool disabled = irqs_disabled();
> > > +
> > > + if (disabled)
> > > + local_irq_enable();
> > > /*
> > > * does a slab alloc which can sleep
> > > */
> > > @@ -785,7 +788,9 @@ void math_state_restore(void)
> > > do_group_exit(SIGKILL);
> > > return;
> > > }
> > > - local_irq_disable();
> > > +
> > > + if (disabled)
> > > + local_irq_disable();
> > > }
> >
> > Yuck!
> >
> > Is there a fundamental reason why we cannot simply enable irqs and
> > leave them enabled? Math state restore is not atomic and cannot really
> > be atomic.
>
> You know, I didn't even try to verify ;) but see below.
So I'm thinking about the attached patch.
> Most probably we can simply enable irqs, yes. But what about older
> kernels, how can we check?
>
> And let me repeat, I strongly believe that this !tsk_used_math()
> case in math_state_restore() must die. And unlazy_fpu() in
> init_fpu(). And both __restore_xstate_sig() and flush_thread()
> should not use math_state_restore() at all. At least in its current
> form.
Agreed.
> But this is obviously not -stable material.
>
> That said, I'll try to look into git history tomorrow.
So I think the reasons are:
- historic: because math_state_restore() started out as an interrupt
routine (from the IRQ13 days)
- hardware imposed: the handler is executed with irqs off
- it's probably the fastest implementation: we just run with the
natural irqs-off state the handler executes with.
So there's nothing outright wrong about executing with irqs off in a
trap handler.
> [...] The patch above looks "obviously safe", but perhaps I am
> paranoid too much...
IMHO your hack above isn't really acceptable, even for a backport.
So lets test the patch below (assuming it's the right thing to do)
and move forward?
Thanks,
Ingo
======================>
From: Ingo Molnar <mingo@...nel.org>
Date: Fri, 6 Mar 2015 08:37:57 +0100
Subject: [PATCH] x86/fpu: Don't disable irqs in math_state_restore()
math_state_restore() was historically called with irqs disabled,
because that's how the hardware generates the trap, and also because
back in the days it was possible for it to be an asynchronous
interrupt and interrupt handlers run with irqs off.
These days it's always an instruction trap, and furthermore it does
inevitably complex things such as memory allocation and signal
processing, which is not done with irqs disabled.
So keep irqs enabled.
This might surprise in-kernel FPU users that somehow relied on
interrupts being disabled across FPU usage - but that's
fundamentally fragile anyway due to the inatomicity of FPU state
restores. The trap return will restore interrupts to its previous
state, but if FPU ops trigger math_state_restore() there's no
guarantee of atomicity anymore.
To warn about in-kernel irqs-off users of FPU state we might want to
pass 'struct pt_regs' to math_state_restore() and check the trapped
state for irqs disabled (flags has IF cleared) and kernel context -
but that's for a later patch.
Cc: Andy Lutomirski <luto@...capital.net>
Cc: Borislav Petkov <bp@...en8.de>
Cc: Fenghua Yu <fenghua.yu@...el.com>
Cc: H. Peter Anvin <hpa@...or.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Oleg Nesterov <oleg@...hat.com>
Cc: Quentin Casasnovas <quentin.casasnovas@...cle.com>
Cc: Thomas Gleixner <tglx@...utronix.de>
Signed-off-by: Ingo Molnar <mingo@...nel.org>
---
arch/x86/kernel/traps.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 950815a138e1..52f9e4057cee 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -844,8 +844,9 @@ void math_state_restore(void)
{
struct task_struct *tsk = current;
+ local_irq_enable();
+
if (!tsk_used_math(tsk)) {
- local_irq_enable();
/*
* does a slab alloc which can sleep
*/
@@ -856,7 +857,6 @@ void math_state_restore(void)
do_group_exit(SIGKILL);
return;
}
- local_irq_disable();
}
/* Avoid __kernel_fpu_begin() right after __thread_fpu_begin() */
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists