Message-ID: <20170802124239.GD1919@mai>
Date: Wed, 2 Aug 2017 14:42:39 +0200
From: Daniel Lezcano <daniel.lezcano@...aro.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: paulmck@...ux.vnet.ibm.com, john.stultz@...aro.org,
linux-kernel@...r.kernel.org
Subject: Re: RCU stall when using function_graph
On Tue, Aug 01, 2017 at 08:12:14PM -0400, Steven Rostedt wrote:
> On Wed, 2 Aug 2017 00:15:44 +0200
> Daniel Lezcano <daniel.lezcano@...aro.org> wrote:
>
> > On 02/08/2017 00:04, Paul E. McKenney wrote:
> > >> Hi Paul,
> > >>
> > >> I have been trying to set the function_graph tracer for ftrace and each time I
> > >> get a CPU stall.
> > >>
> > >> How to reproduce:
> > >> -----------------
> > >>
> > >> echo function_graph > /sys/kernel/debug/tracing/current_tracer
> > >>
> > >> This error appears with v4.13-rc3 and v4.12-rc6.
>
> Can you bisect this? It may be due to this commit:
>
> 0598e4f08 ("ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines")

Hi Steve,

I ran a git bisect, but the issue occurred at every step. I went through the
different versions down to v4.4, where the board was not fully supported, and
still ended up with the same issue.

Finally, I had the intuition that it could be related to the wall time (there
is no battery-backed RTC on the board and the wall time is Jan 1st, 1970).

Setting the time with ntpdate solved the problem.

Even if it is rare for the time not to be set, is it normal to get an RCU CPU
stall?

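
The workaround described above can be sketched as a small shell guard that
refuses to switch tracers while the wall clock still looks unset. This is only
a sketch under stated assumptions: the helper names clock_looks_unset and
enable_function_graph and the one-year threshold are hypothetical, chosen for
illustration, and are not part of ftrace or the kernel.

```shell
#!/bin/sh
# Sketch only: helper names and the threshold are illustrative assumptions.

# Succeeds when the given time (seconds since the epoch, default: now) is so
# close to Jan 1st, 1970 that the wall clock was probably never set.
clock_looks_unset() {
    now=${1:-$(date +%s)}
    # Assumed threshold: less than one year (365 days) after the epoch.
    [ "$now" -lt 31536000 ]
}

# Switch to the function_graph tracer only if the clock looks sane.
# $1: tracing directory (defaults to the path used in the reproducer).
enable_function_graph() {
    tracing_dir=${1:-/sys/kernel/debug/tracing}
    if clock_looks_unset; then
        echo "wall clock looks unset; set it first (e.g. with ntpdate)" >&2
        return 1
    fi
    echo function_graph > "$tracing_dir/current_tracer"
}
```

Calling enable_function_graph as root right after boot would then fail with a
message instead of changing the tracer, until the time has been set. It does
not explain the stall itself; it only avoids the condition that triggers it
here.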
> > >>
> > >> Is it something already reported ?
> > >
> > > I have seen this sort of thing, but only when actually dumping the trace
> > > out, and I thought those got fixed. You are seeing this just accumulating
> > > the trace?
> >
> > No, just by changing the tracer. It is the first operation I do after
> > rebooting and it is reproducible each time. That happens on an ARM64
> > platform.
> >
> > > These RCU CPU stall warnings usually occur when something grabs hold of
> > > a CPU for too long, as in 21 seconds or so. One way that they can happen
> > > is excessive lock contention, another is having the kernel run through
> > > too much data at one shot.
> > >
> > > Adding Steven Rostedt on CC for his thoughts.
> > >
> >
--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog