linux-kernel - Re: [PATCH v6 2/2] tracing: Add support for preempt and irq enable/disable events

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20170925083432.jvaewlsrb46wjmjj@hirez.programming.kicks-ass.net>
Date:   Mon, 25 Sep 2017 10:34:32 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Joel Fernandes <joelaf@...gle.com>
Cc:     LKML <linux-kernel@...r.kernel.org>, kernel-team@...roid.com,
        Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH v6 2/2] tracing: Add support for preempt and irq
 enable/disable events

On Fri, Sep 22, 2017 at 11:28:30AM -0700, Joel Fernandes wrote:
> >> +     TP_printk("caller=%pF parent=%pF",
> >> +               (void *)((unsigned long)(_stext) + __entry->caller_offs),
> >> +               (void *)((unsigned long)(_stext) + __entry->parent_offs))
> >> +);
> >
> > So I don't get that function prototype. Why do we need two IPs?
> 
> This follows the exact same "model" used by the preempt/irqsoff
> tracer. The rationale IIUC is a lot of times the actual preempt event
> is done by say a lock, but the actual caller of the lock function
> causing the preempt disable event is also of interest so this gives
> some more information of interest without needing the full stack, for
> example the spinlock acquiring in add_wait_queue disabled preemption
> here:
> 
>             grep-1041  [002] d..1    80.363455: preempt_disable:
> caller=_raw_spin_lock_irqsave+0x1d/0x40
> parent=add_wait_queue+0x15/0x50

I suppose that sort-of makes sense for the preempt-tracer, but its a
weird thing for a generic tracepoint.

> >>  void start_critical_timings(void)
> >>  {
> >> +     if (this_cpu_read(tracing_preempt_cpu))
> >> +             trace_preempt_enable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
> >> +
> >> +     if (this_cpu_read(tracing_irq_cpu))
> >> +             trace_irq_enable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
> >> +
> >>       start_critical_timings_tracer();
> >>  }
> >>  EXPORT_SYMBOL_GPL(start_critical_timings);
> >>
> >>  void stop_critical_timings(void)
> >>  {
> >> +     if (this_cpu_read(tracing_preempt_cpu))
> >> +             trace_preempt_disable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
> >> +
> >> +     if (this_cpu_read(tracing_irq_cpu))
> >> +             trace_irq_disable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
> >> +
> >>       stop_critical_timings_tracer();
> >>  }
> >>  EXPORT_SYMBOL_GPL(stop_critical_timings);
> >
> > And I feel these yield somewhat odd semantics, does that need explaining
> > somewhere?
> 
> Maybe I can add a comment here, if you prefer that. When you meant
> semantics, do you mean 'critical' vs 'atomic' thing or do you mean the
> semantics/context of how this function is supposed to be used?

I would add the comment to the tracepoint definition.

On semantics, the whole stop/start excludes a fair bunch of walltime
from our measurement, I feel that needs to be called out and enumerated
(when time goes missing and why).

Given that the idle thread runs with preempt-off I understand its
purpose from the POV from the preempt-tracer, but its 'weird' behaviour
if you're looking at it from a pure tracepoint pov.

> >>  void trace_preempt_off(unsigned long a0, unsigned long a1)
> >>  {
> >> +     if (this_cpu_read(tracing_preempt_cpu))
> >> +             return;
> >> +
> >> +     this_cpu_write(tracing_preempt_cpu, 1);
> >> +
> >> +     trace_preempt_disable_rcuidle(a0, a1);
> >>       tracer_preempt_off(a0, a1);
> >>  }
> >>  #endif
> >
> > And here you assume things like double on / double off don't happen,
> > which might well be so, but does seem somewhat fragile.
> >
> 
> We are handling the cases where these functions might be called twice,
> but we are only interested in the first time they're called. I caught
> a dead lock happen when I didn't add such protection to
> trace_hardirqs_off so I added to these to the trace_hardirqs* and
> trace_preempt* ones as well to just to be extra safe and keep it
> consistent. Hope I understood your concern correctly, if not please
> let me know, thanks.

Under what conditions where they called twice? That seems like something
that should not happen to begin with. Esp the one I left quoted above,
disabling when its already disabled sounds like fail. So please provide
more details on the scenario you're working around.