[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrXSsTa76QGg+MHDAkrr9gJPSqm2zKrcTbU5SV31mWDVuQ@mail.gmail.com>
Date:	Wed, 3 Dec 2014 16:38:46 -0800
From:	Andy Lutomirski <luto@...capital.net>
To:	Dave Jones <davej@...hat.com>,
	Andy Lutomirski <luto@...capital.net>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	Richard Guy Briggs <rgb@...hat.com>,
	Eric Paris <eparis@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Paul McKenney <paulmck@...ux.vnet.ibm.com>
Subject: Re: [PATCH] context_tracking: Restore previous state in schedule_user
On Wed, Dec 3, 2014 at 4:30 PM, Dave Jones <davej@...hat.com> wrote:
> On Wed, Dec 03, 2014 at 04:04:31PM -0800, Andy Lutomirski wrote:
>  > On Wed, Dec 3, 2014 at 3:58 PM, Frederic Weisbecker <fweisbec@...il.com> wrote:
>  > > On Wed, Dec 03, 2014 at 03:18:41PM -0800, Andy Lutomirski wrote:
>  > >> It appears that some SCHEDULE_USER (asm for schedule_user) callers
>  > >> in arch/x86/kernel/entry_64.S are called from RCU kernel context,
>  > >> and schedule_user will return in RCU user context.  This causes RCU
>  > >> warnings and possible failures.
>  > >>
>  > >> This is intended to be a minimal fix suitable for 3.18.
>  > >>
>  > >> Reported-by: Dave Jones <davej@...hat.com>
>  > >> Cc: Oleg Nesterov <oleg@...hat.com>
>  > >> Cc: Frédéric Weisbecker <fweisbec@...il.com>
>  > >> Cc: Paul McKenney <paulmck@...ux.vnet.ibm.com>
>  > >> Signed-off-by: Andy Lutomirski <luto@...capital.net>
>  > >
>  > > Ah, we sent it about at the same time :-)
>  > >
>  > > Might be too late for 3.18 though because it's not a regression.
>
> Wait, so how come that trace didn't start showing up until recently ?
Looking at the code, it's because int_careful has the same bug, but
syscall_trace_leave does:
    /*
     * We may come here right after calling schedule_user()
     * or do_notify_resume(), in which case we can be in RCU
     * user mode.
     */
    user_exit();
which means that this issue was anticipated when that comment was written.
Prior to the 3.18 seccomp changes and the _TIF_WORK typo fix, it would
have been difficult to hit sysret_audit when context tracking was on
(you could do it once on the way out from a syscall that enabled
context tracking).  So this is 3.18 regression.
The sysret_audit code is still totally screwed up AFAICT.  At the very
least, the whole mess rather strongly suggests that, if both context
tracking and audit are on, then __audit_syscall_exit is called *twice*
on each syscall.  __audit_syscall_exit seems to be idempotent, so
maybe no one has noticed that little glitch.
I'll ask the x86 people to include my sysret_audit removal for 3.19,
since I think that this schedule_user change is a better last-minute
fix than removing a whole chunk of asm.
--Andy
>
>         Dave
>
-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Powered by blists - more mailing lists
 
