[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20141005202300.GA27962@redhat.com>
Date: Sun, 5 Oct 2014 22:23:00 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>,
Frederic Weisbecker <fweisbec@...il.com>,
Steven Rostedt <rostedt@...dmis.org>
Cc: Sasha Levin <sasha.levin@...cle.com>,
Ingo Molnar <mingo@...nel.org>, Peter Anvin <hpa@...or.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Andy Lutomirski <luto@...capital.net>,
Denys Vlasenko <dvlasenk@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Chuck Ebbert <cebbert.lkml@...il.com>
Subject: [PATCH 0/1] stop the unbound recursion in
preempt_schedule_context()
On 10/04, Oleg Nesterov wrote:
>
> On 10/03, Linus Torvalds wrote:
> >
> > On Fri, Oct 3, 2014 at 5:01 PM, Linus Torvalds
> > <torvalds@...ux-foundation.org> wrote:
> > >
> > > The real fix would appear to be to use
> > > "preempt_enable_no_resched_notrace()", which your patch did, but
> > > without the loop.
> >
> > Actually, the real fix would be to not be stupid, and just make the
> > code do something like
> >
> > > if (likely(!preemptible()))
> > > return;
> > >
> > > __preempt_count_add(PREEMPT_ACTIVE);
> > > prev_ctx = exception_enter();
> > >
> > > __schedule();
> > >
> > > exception_exit(prev_ctx);
> > > __preempt_count_sub(PREEMPT_ACTIVE);
> >
> > and *not* enable preemption around the scheduling at all.
Yes, I think you are right. And I hate to admit that I didn't think about
this simplification. The only complication is that we should move this
function from context_tracking.c to sched/core.c.
However, I still think we need the need_resched() loop, we can't do this
only once.
And in fact the simplest fix could just turn it into
local_irq_disable();
preempt_schedule_irq();
local_irq_enable();
> Again, it is too late for me... Most probably I am wrong, but somehow
> it seems to me that the real fix should try to kill preempt_schedule_context()
> altogether and teach preempt_schedule() to play well with CONTEXT_TRACKING.
Yes, the very fact that
preempt_enable() != trace_preempt_on() + preempt_enable_notrace()
looks simply wrong imo. And preempt_enable_notrace() has users outside of
the tracing code which can run in IN_USER state. For example, why should
__perf_sw_event() worry about context_tracking.state?
OTOH, if the caller of preempt_enable_notrace() actually needs to take care
of potential IN_USER state, then it probably has other problems. And indeed,
ftrace_ops_control_func() has to check rcu_is_watching() anyway.
IOW, imho 29bb9e5a75 "tracing/context-tracking: Add preempt_schedule_context()
for tracing" was not a right solution. Perhaps the tracing functions can be
changed to switch to IN_KERNEL mode, or at least perhaps we can add the
special preempt_disable_ftrace() helper. But this needs another discussion.
Either way, I do think that preempt_schedule_context() in its current form
must die.
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists