lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 16 Sep 2021 14:01:21 -0700
From:   Namhyung Kim <namhyung@...nel.org>
To:     Leo Yan <leo.yan@...aro.org>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Jiri Olsa <jolsa@...hat.com>, Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Andi Kleen <ak@...ux.intel.com>,
        Ian Rogers <irogers@...gle.com>,
        Stephane Eranian <eranian@...gle.com>,
        James Clark <james.clark@....com>,
        Adrian Hunter <adrian.hunter@...el.com>
Subject: Re: [RFC] perf arm-spe: Track task context switch for cpu-mode events

Hi Leo,

(Removing Tan as the email bounced)

On Thu, Sep 16, 2021 at 6:54 AM Leo Yan <leo.yan@...aro.org> wrote:
>
> Hi Namhyung,
>
> On Wed, Sep 15, 2021 at 05:17:48PM -0700, Namhyung Kim wrote:
> > When perf report synthesize events from ARM SPE data, it refers to
> > current cpu, pid and tid in the machine.  But there's no place to set
> > them in the ARM SPE decoder.  I'm seeing all pid/tid is set to -1 and
> > user symbols are not resolved in the output.
> >
> >   # perf record -a -e arm_spe_0/ts_enable=1/ sleep 1
> >
> >   # perf report -q | head
> >      8.77%     8.77%  :-1      [kernel.kallsyms]  [k] format_decode
> >      7.02%     7.02%  :-1      [kernel.kallsyms]  [k] seq_printf
> >      7.02%     7.02%  :-1      [unknown]          [.] 0x0000ffff9f687c34
> >      5.26%     5.26%  :-1      [kernel.kallsyms]  [k] vsnprintf
> >      3.51%     3.51%  :-1      [kernel.kallsyms]  [k] string
> >      3.51%     3.51%  :-1      [unknown]          [.] 0x0000ffff9f66ae20
> >      3.51%     3.51%  :-1      [unknown]          [.] 0x0000ffff9f670b3c
> >      3.51%     3.51%  :-1      [unknown]          [.] 0x0000ffff9f67c040
> >      1.75%     1.75%  :-1      [kernel.kallsyms]  [k] ___cache_free
> >      1.75%     1.75%  :-1      [kernel.kallsyms]  [k] __count_memcg_events
> >
> > Like Intel PT, add context switch records to track task info.  As ARM
> > SPE support was added later than PERF_RECORD_SWITCH_CPU_WIDE, I think
> > we can safely set the attr.context_switch bit and use it.
>
> Thanks for the patch.
>
> Before we had discussion for enabling PID/TID for SPE samples; in the patch
> set [1], patches 07, 08 set sample's pid/tid based on the Arm SPE context
> packets.  To enable hardware tracing context ID, you also needs to enable
> kernel config CONFIG_PID_IN_CONTEXTIDR.

Thanks for sharing this.

Yeah I also look at the context info but having a dependency on a kconfig
looks limiting its functionality.  Also the kconfig says it has some overhead
in the critical path (even if perf is not running, right?) - but not sure how
much it can add.

config PID_IN_CONTEXTIDR
    bool "Write the current PID to the CONTEXTIDR register"
    help
      Enabling this option causes the kernel to write the current PID to
      the CONTEXTIDR register, at the expense of some additional
      instructions during context switch. Say Y here only if you are
      planning to use hardware trace tools with this kernel.

>
> At that time, there have a concern is the hardware context ID might
> introduce confusion for non-root namespace.

Sounds like a problem.

>
> We also considered to use PERF_RECORD_SWITCH_CPU_WIDE event for setting
> pid/tid, the Intel PT implementation uses two things to set sample's
> pid/tid: one is PERF_RECORD_SWITCH_CPU_WIDE event and another is to detect
> the branch instruction is the symbol "__switch_to".  Since the trace
> event PERF_RECORD_SWITCH_CPU_WIDE is coarse, so it only uses the new
> pid/tid after the branch instruction for "__switch_to".  Arm SPE is
> 'statistical', thus it cannot promise the trace data must contain the
> branch instruction for "__switch_to", please see details [2].

I can see the need in the Intel PT as it needs to trace all (branch)
instructions, but is it really needed for ARM SPE too?
Maybe I am missing something, but it seems enough to have a
coarse-grained context switch for sampling events..

>
> I think the feasible way is to use CONTEXTIDR to trace PID/TID _only_
> for root namespace, and the perf tool uses context packet to set
> pid/tid for samples.  So except we need patches 07 and 08, we also
> need a change in Arm SPE driver as below:
>
> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> index d44bcc29d99c..2553d53d3772 100644
> --- a/drivers/perf/arm_spe_pmu.c
> +++ b/drivers/perf/arm_spe_pmu.c
> @@ -272,7 +272,9 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
>         if (!attr->exclude_kernel)
>                 reg |= BIT(SYS_PMSCR_EL1_E1SPE_SHIFT);
>
> -       if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
> +       /* Only enable context ID tracing for root namespace */
> +       if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable() &&
> +           (task_active_pid_ns(current) == &init_pid_ns))
>                 reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT);
>
>         return reg;
>
> Could you confirm if this works for you?  If it's okay for you, I will
> sync with James for upstreaming the changes.

Let me think about this more..

Thanks,
Namhyung


>
> Thanks,
> Leo
>
> [1] https://lore.kernel.org/lkml/20210119144658.793-8-james.clark@arm.com/
> [2] https://lore.kernel.org/lkml/20210204102734.GA4737@leoy-ThinkPad-X240s/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ