Message-ID: <e9375c7c-745d-fac3-6e16-539712ceaaea@redhat.com>
Date: Fri, 14 Dec 2018 11:21:33 +0100
From: Daniel Bristot de Oliveira <bristot@...hat.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Clark Williams <williams@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Masami Hiramatsu <mhiramat@...nel.org>,
linux-rt-users <linux-rt-users@...r.kernel.org>,
Marko Pusch <marko.pusch@...mens.com>,
Tommaso Cucinotta <tommaso.cucinotta@...up.it>,
Rômulo Silva de Oliveira
<romulo.deoliveira@...c.br>, Ingo Molnar <mingo@...hat.com>
Subject: Re: BUG: ftrace/perf dropping events at the begin of interrupt
handlers
On 12/4/18 8:16 PM, Steven Rostedt wrote:
> Yes, it's a simple fix. The problem is that the recursion detection of
> the function tracer requires that when it's called from an interrupt,
> "in_interrupt" needs to be true, otherwise it thinks that the function
> tracer is recursing on itself (which is common).
>
> Looking at the dropped events, and the code in __irq_enter() we have
> this:
>
> #define __irq_enter()                                   \
>         do {                                            \
>                 account_irq_enter_time(current);        \
>                 preempt_count_add(HARDIRQ_OFFSET);      \ <<-- in_interrupt() returns true here
>                 trace_hardirq_enter();                  \
>         } while (0)
>
> Interestingly enough, the dropped events happen to be in
> account_irq_enter_time()!
>
> Thus what I believe is happening is that an interrupt came in while one
> event was being recorded. When account_irq_enter_time() was called, the
> function tracer noticed that its recursion bit for the current context
> was already set, and just dropped the event because it thought it was
> just tracing itself. After we add HARDIRQ_OFFSET to preempt_count,
> "in_interrupt()" will be set and the function tracer will know it's in a
> new context where it's safe to continue tracing.
>
> Can you try this patch to see if it fixes it for you?
Hi Steve,
I finally took some time to play with the patch, sorry for the delay. I got the
idea of the patch, but it is not working as expected :-(.
When I enable it, the system [a VM with 1 CPU] mostly freezes when I run this:
# while [ 1 ]; do echo > /dev/null; done &
I still need to investigate why.
The other point: as I understand it, the patch would make
account_irq_enter_time() start showing up in the trace. But, as far as I
understood, it would still not trace do_IRQ(). Right?
Wouldn't this be a case for using a per-cpu variable to set the flag right at
the beginning of the handler (in the entry*.s)?
Thoughts?
-- Daniel