lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210916135418.GA383600@leoy-ThinkPad-X240s>
Date:   Thu, 16 Sep 2021 21:54:18 +0800
From:   Leo Yan <leo.yan@...aro.org>
To:     Namhyung Kim <namhyung@...nel.org>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Jiri Olsa <jolsa@...hat.com>, Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Andi Kleen <ak@...ux.intel.com>,
        Ian Rogers <irogers@...gle.com>,
        Stephane Eranian <eranian@...gle.com>,
        James Clark <james.clark@....com>,
        Tan Xiaojun <tanxiaojun@...wei.com>,
        Adrian Hunter <adrian.hunter@...el.com>
Subject: Re: [RFC] perf arm-spe: Track task context switch for cpu-mode events

Hi Namhyung,

On Wed, Sep 15, 2021 at 05:17:48PM -0700, Namhyung Kim wrote:
> When perf report synthesize events from ARM SPE data, it refers to
> current cpu, pid and tid in the machine.  But there's no place to set
> them in the ARM SPE decoder.  I'm seeing all pid/tid is set to -1 and
> user symbols are not resolved in the output.
> 
>   # perf record -a -e arm_spe_0/ts_enable=1/ sleep 1
> 
>   # perf report -q | head
>      8.77%     8.77%  :-1      [kernel.kallsyms]  [k] format_decode
>      7.02%     7.02%  :-1      [kernel.kallsyms]  [k] seq_printf
>      7.02%     7.02%  :-1      [unknown]          [.] 0x0000ffff9f687c34
>      5.26%     5.26%  :-1      [kernel.kallsyms]  [k] vsnprintf
>      3.51%     3.51%  :-1      [kernel.kallsyms]  [k] string
>      3.51%     3.51%  :-1      [unknown]          [.] 0x0000ffff9f66ae20
>      3.51%     3.51%  :-1      [unknown]          [.] 0x0000ffff9f670b3c
>      3.51%     3.51%  :-1      [unknown]          [.] 0x0000ffff9f67c040
>      1.75%     1.75%  :-1      [kernel.kallsyms]  [k] ___cache_free
>      1.75%     1.75%  :-1      [kernel.kallsyms]  [k] __count_memcg_events
> 
> Like Intel PT, add context switch records to track task info.  As ARM
> SPE support was added later than PERF_RECORD_SWITCH_CPU_WIDE, I think
> we can safely set the attr.context_switch bit and use it.

Thanks for the patch.

Before we had discussion for enabling PID/TID for SPE samples; in the patch
set [1], patches 07, 08 set sample's pid/tid based on the Arm SPE context
packets.  To enable hardware tracing context ID, you also needs to enable
kernel config CONFIG_PID_IN_CONTEXTIDR.

At that time, there have a concern is the hardware context ID might
introduce confusion for non-root namespace.

We also considered to use PERF_RECORD_SWITCH_CPU_WIDE event for setting
pid/tid, the Intel PT implementation uses two things to set sample's
pid/tid: one is PERF_RECORD_SWITCH_CPU_WIDE event and another is to detect
the branch instruction is the symbol "__switch_to".  Since the trace
event PERF_RECORD_SWITCH_CPU_WIDE is coarse, so it only uses the new
pid/tid after the branch instruction for "__switch_to".  Arm SPE is
'statistical', thus it cannot promise the trace data must contain the
branch instruction for "__switch_to", please see details [2].

I think the feasible way is to use CONTEXTIDR to trace PID/TID _only_
for root namespace, and the perf tool uses context packet to set
pid/tid for samples.  So except we need patches 07 and 08, we also
need a change in Arm SPE driver as below:

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index d44bcc29d99c..2553d53d3772 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -272,7 +272,9 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
        if (!attr->exclude_kernel)
                reg |= BIT(SYS_PMSCR_EL1_E1SPE_SHIFT);
 
-       if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
+       /* Only enable context ID tracing for root namespace */
+       if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable() &&
+           (task_active_pid_ns(current) == &init_pid_ns))
                reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT);
 
        return reg;

Could you confirm if this works for you?  If it's okay for you, I will
sync with James for upstreaming the changes.

Thanks,
Leo

[1] https://lore.kernel.org/lkml/20210119144658.793-8-james.clark@arm.com/
[2] https://lore.kernel.org/lkml/20210204102734.GA4737@leoy-ThinkPad-X240s/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ