[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190729221506.1aed7dfc@oasis.local.home>
Date: Mon, 29 Jul 2019 22:15:06 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Eiichi Tsukata <devel@...ukata.com>
Cc: joel@...lfernandes.org, paulmck@...ux.vnet.ibm.com,
tglx@...utronix.de, peterz@...radead.org, mingo@...hat.com,
fweisbec@...il.com, luto@...capital.net,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] tracing: Prevent RCU EQS breakage in preemptirq events
On Tue, 30 Jul 2019 11:00:42 +0900
Eiichi Tsukata <devel@...ukata.com> wrote:
> Thanks for comments.
>
> On 2019/07/30 0:21, Steven Rostedt wrote:
> > On Mon, 29 Jul 2019 10:07:34 +0900
> > Eiichi Tsukata <devel@...ukata.com> wrote:
> >
> >> If context tracking is enabled, causing page fault in preemptirq
> >> irq_enable or irq_disable events triggers the following RCU EQS warning.
> >>
> >> Reproducer:
> >>
> >> // CONFIG_PREEMPTIRQ_EVENTS=y
> >> // CONFIG_CONTEXT_TRACKING=y
> >> // CONFIG_RCU_EQS_DEBUG=y
> >> # echo 1 > events/preemptirq/irq_disable/enable
> >> # echo 1 > options/userstacktrace
> >
> > So the problem is only with userstacktrace enabled?
>
> It can happen when tracing code causes page fault in preemptirq events.
> For example, the following perf command also hit the warning:
>
> # perf record -e 'preemptirq:irq_enable' -g ls
Again,
That's not a irq trace event issue, that's a stack trace issue.
>
>
> >>
> >> __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
> >> {
> >> + enum ctx_state prev_state;
> >> +
> >> if (this_cpu_read(tracing_irq_cpu)) {
> >> - if (!in_nmi())
> >> + if (!in_nmi()) {
> >
> > This is a very high fast path (for tracing irqs off and such). Instead
> > of adding a check here for a case that is seldom used (userstacktrace
> > and tracing irqs on/off). Move this to surround the userstack trace
> > code.
> >
> > -- Steve
>
> If the problem was only with userstacktrace, it will be reasonable to
> surround only the userstack unwinder. But the situation is similar to
> the previous "tracing vs CR2" case. As Peter taught me in
> https://lore.kernel.org/lkml/20190708074823.GV3402@hirez.programming.kicks-ass.net/
> there are some other codes likely to to user access.
> So I surround preemptirq events earlier.
I disagree. The issue is with the attached callbacks that call
something (a stack unwinder) that can fault.
This is called whenever irqs are disabled. I say we surround the
culprit (the stack unwinder or stack trace) and not punish the irq
enable/disable events.
So NAK on this patch.
-- Steve
Powered by blists - more mailing lists