Message-ID: <CAJWu+ooMB7pDu0xAkqwZBgBO67CFW6A4AXa3hbKo-oSh93t5=Q@mail.gmail.com>
Date: Mon, 25 Sep 2017 17:23:00 -0700
From: Joel Fernandes <joelaf@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: LKML <linux-kernel@...r.kernel.org>, kernel-team@...roid.com,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH v6 2/2] tracing: Add support for preempt and irq
enable/disable events
Hi Peter,
On Mon, Sep 25, 2017 at 1:34 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Fri, Sep 22, 2017 at 11:28:30AM -0700, Joel Fernandes wrote:
[...]
>> >> void start_critical_timings(void)
>> >> {
>> >> + if (this_cpu_read(tracing_preempt_cpu))
>> >> + trace_preempt_enable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
>> >> +
>> >> + if (this_cpu_read(tracing_irq_cpu))
>> >> + trace_irq_enable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
>> >> +
>> >> start_critical_timings_tracer();
>> >> }
>> >> EXPORT_SYMBOL_GPL(start_critical_timings);
>> >>
>> >> void stop_critical_timings(void)
>> >> {
>> >> + if (this_cpu_read(tracing_preempt_cpu))
>> >> + trace_preempt_disable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
>> >> +
>> >> + if (this_cpu_read(tracing_irq_cpu))
>> >> + trace_irq_disable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
>> >> +
>> >> stop_critical_timings_tracer();
>> >> }
>> >> EXPORT_SYMBOL_GPL(stop_critical_timings);
>> >
>> > And I feel these yield somewhat odd semantics, does that need explaining
>> > somewhere?
>>
>> Maybe I can add a comment here, if you prefer that. By 'semantics',
>> do you mean the 'critical' vs 'atomic' thing, or the semantics/context
>> of how this function is supposed to be used?
>
> I would add the comment to the tracepoint definition.
>
> On semantics, the whole stop/start excludes a fair bunch of walltime
> from our measurement, I feel that needs to be called out and enumerated
> (when time goes missing and why).
>
> Given that the idle thread runs with preempt-off I understand its
> purpose from the POV of the preempt-tracer, but it's 'weird' behaviour
> if you're looking at it from a pure tracepoint POV.
I see your point: basically you have a problem with the "fake preempt
on" event that I'm generating when start_critical_timings is called
from the idle code path, which isn't really a preempt-on per se but a
side effect of entering the idle path.
Since we already have cpuidle events, one thing I could do is just
drop this part of the patch set and not generate any trace events
during start_critical_timings; one could then use the preempt_disable
trace event (introduced in this patch) together with the cpuidle trace
events to figure out that part of the preempt-disabled section is
actually the CPU idle path. Would that alleviate your concern here?
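(For illustration, a sketch of what start_critical_timings would then
reduce to, with only the tracer hook from the quoted hunk remaining:)

void start_critical_timings(void)
{
        /*
         * No preempt/irq enable events emitted here; walltime spent in
         * idle would instead be identified from the existing cpu_idle
         * trace events.
         */
        start_critical_timings_tracer();
}
EXPORT_SYMBOL_GPL(start_critical_timings);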
As a next step, Steven and I discussed in the hallway at LPC that we
could possibly add a start_critical_timings event as well, so as a
follow-up to this patch those new events could be added, giving more
visibility into walltime that shouldn't be counted. Or I could add it
in this series as a separate patch; let me know if that makes sense
and what you'd like to do, thanks.
>> >> void trace_preempt_off(unsigned long a0, unsigned long a1)
>> >> {
>> >> + if (this_cpu_read(tracing_preempt_cpu))
>> >> + return;
>> >> +
>> >> + this_cpu_write(tracing_preempt_cpu, 1);
>> >> +
>> >> + trace_preempt_disable_rcuidle(a0, a1);
>> >> tracer_preempt_off(a0, a1);
>> >> }
>> >> #endif
>> >
>> > And here you assume things like double on / double off don't happen,
>> > which might well be so, but does seem somewhat fragile.
>> >
>>
>> We are handling the cases where these functions might be called twice,
>> but we are only interested in the first time they're called. I caught
>> a deadlock when I didn't add such protection to trace_hardirqs_off,
>> so I added it to the trace_hardirqs* and trace_preempt* functions as
>> well, just to be extra safe and to keep things consistent. Hope I
>> understood your concern correctly; if not, please let me know, thanks.
>
> Under what conditions were they called twice? That seems like something
> that should not happen to begin with. Esp. the one I left quoted above,
Ok, so AIUI the double call cannot happen with trace_preempt_on and
trace_preempt_off; I agree, and I'll drop the unnecessary per-CPU
variable check there, but...
> disabling when its already disabled sounds like fail. So please provide
> more details on the scenario you're working around.
It can still happen with local_irq_save called while interrupts are
already disabled: the trace_hardirqs_off API can be called even when
IRQs are already off. This is unlike trace_hardirqs_on, which checks
whether IRQs are off (at least from some call sites); here are the
definitions just for reference [1]. I guess we could modify
local_irq_disable and local_irq_save to check what the HW flags were
before calling raw_local_irq_save, and only call trace_hardirqs_off if
they were indeed on and are now being turned off, but that adds
complexity - we would also have to modify all the call sites in
assembly code to conditionally call trace_hardirqs_on/off :(.
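(Just to illustrate that complexity, not something I'm proposing: a
rough sketch of such a conditional local_irq_save, using the flags
that raw_local_irq_save hands back to decide whether IRQs were on
beforehand; the assembly call sites would still be untouched by this:)

#define local_irq_save(flags)                                   \
        do {                                                    \
                raw_local_irq_save(flags);                      \
                /* flags now hold the pre-disable IRQ state */  \
                if (!raw_irqs_disabled_flags(flags))            \
                        trace_hardirqs_off();                   \
        } while (0)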
Instead, my patch uses a per-CPU variable in the trace_hardirqs_*
functions to track that interrupts were already off when they were
called, thus avoiding uselessly emitting a trace event:
void trace_hardirqs_off(void)
{
        if (this_cpu_read(tracing_irq_cpu))
                return;

        this_cpu_write(tracing_irq_cpu, 1);

        trace_irq_disable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
        tracer_hardirqs_off();
}
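The enable side then clears the flag symmetrically, so an enable event
is only emitted when a matching disable event was emitted earlier (a
sketch of how I understand the pairing in the patch):

void trace_hardirqs_on(void)
{
        /* Only emit the enable event if we emitted the disable. */
        if (!this_cpu_read(tracing_irq_cpu))
                return;

        trace_irq_enable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
        tracer_hardirqs_on();

        this_cpu_write(tracing_irq_cpu, 0);
}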
I feel this is the cleanest path forward for the IRQ disable/enable
trace events.
Are you OK with this?
thanks,
- Joel
[1] http://elixir.free-electrons.com/linux/latest/source/include/linux/irqflags.h#L89