Date:   Wed, 6 Oct 2021 17:36:20 +0800
From:   Leo Yan <leo.yan@...aro.org>
To:     German Gomez <german.gomez@....com>
Cc:     James Clark <james.clark@....com>,
        Namhyung Kim <namhyung@...nel.org>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Jiri Olsa <jolsa@...hat.com>, Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Andi Kleen <ak@...ux.intel.com>,
        Ian Rogers <irogers@...gle.com>,
        Stephane Eranian <eranian@...gle.com>,
        Adrian Hunter <adrian.hunter@...el.com>
Subject: Re: [RFC] perf arm-spe: Track task context switch for cpu-mode events

Hi German,

On Tue, Oct 05, 2021 at 11:06:12AM +0100, German Gomez wrote:

[...]

> Yesterday we did some testing and found that there seems to be an exact
> match between using context packets and switch events. However this only
> applies when tracing in userspace (by adding the 'u' suffix to the perf
> event). Otherwise we still see as much as 2% of events having the wrong
> PID around the time of the switch.

This result sounds reasonable to me: if we only trace userspace, there
should be no difference between using context packets and switch
events.

The deviation with switch events (1.30% in the result below) is a bit
high once kernel tracing is enabled.

> In order to measure this I applied Namhyung's patch and James's patch
> from [1]. Then added a printf line to the function arm_spe_prep_sample
> where I have access to both PID values, as a quick way to compare them
> later in a perf-report run. This is the diff of the printf patch:
> 
> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> index 41385ab96fbc..591985c66ac4 100644
> --- a/tools/perf/util/arm-spe.c
> +++ b/tools/perf/util/arm-spe.c
> @@ -247,6 +247,8 @@ static void arm_spe_prep_sample(struct arm_spe *spe,
>     event->sample.header.type = PERF_RECORD_SAMPLE;
>     event->sample.header.misc = sample->cpumode;
>     event->sample.header.size = sizeof(struct perf_event_header);
> +
> +       printf(">>>>>> %d / %lu\n", speq->tid, record->context_id & 0x7fff);
>  }
> 
> The error percentages were obtained by running the following
> perf-record commands for different configurations:
> 
> $ sudo ./perf record -e arm_spe/ts_enable=1,load_filter=1,store_filter=1/u -a -- sleep 60
> $ sudo ./perf report --stdio \
>     | grep ">>>>>>" \
>     | awk '{total++; if($2!=$4) miss++} END {print "Error: " (100*miss/total) "% out of " total " samples"}'
> 
> Error: 0% out of 11839328 samples
> 
> $ sudo ./perf record -e arm_spe/ts_enable=1,load_filter=1,store_filter=1/ -a -- sleep 10
> $ sudo ./perf report --stdio \
>     | grep ">>>>>>" \
>     | awk '{total++; if($2!=$4) miss++} END {print "Error: " (100*miss/total) "% out of " total " samples"}'
> 
> Error: 1.30624% out of 3418731 samples

Thanks for sharing this!

> I think the fallback to using switch when we can't use the CONTEXTIDR
> register is a viable option for userspace events, but maybe not so much
> for non-userspace.

Agreed.

If so, it would be good to check the variable
'evsel->core.attr.exclude_kernel' when decoding Arm SPE trace data, and
only use context switch events when 'exclude_kernel' is set.
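
To make the check above concrete, here is a minimal sketch. It is not
taken from the perf sources: 'struct attr_sketch' is a stand-in for the
single 'exclude_kernel' bit of the event attribute, and the helper name
'spe_can_use_switch_events' is hypothetical.

```c
#include <stdbool.h>

/*
 * Stand-in for the one field of the event attribute used here.
 * In perf this would be evsel->core.attr.exclude_kernel; the struct
 * below is an assumption made only to keep the sketch self-contained.
 */
struct attr_sketch {
	unsigned int exclude_kernel : 1;
};

/*
 * Hypothetical helper: switch events are only trusted as the PID
 * source when the trace excludes the kernel (userspace-only tracing).
 */
static bool spe_can_use_switch_events(const struct attr_sketch *attr)
{
	return attr->exclude_kernel != 0;
}
```

With kernel tracing enabled the helper returns false, which matches the
observation that switch events are off by about 1.3% in that case.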

One thing to note here: the perf tool needs to know whether bit 3 'CX'
(macro 'SYS_PMSCR_EL1_CX_SHIFT' in the kernel) has been set in the
PMSCR register.  AFAIK, the Arm SPE driver doesn't expose any interface
(or config) to userspace for context tracing, so one method is to add
an extra config in the Arm SPE driver for this bit, e.g.
'ATTR_CFG_FLD_cx_enable_CFG'.
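
For illustration only, testing that bit would look roughly like the
sketch below. It assumes the PMSCR value (or an equivalent config bit)
were made visible to the tool, which as said above is not the case
today; 'PMSCR_CX_SHIFT_SKETCH' and 'pmscr_cx_enabled' are hypothetical
names, with the shift value taken from the "bit 3" mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 3 'CX' of PMSCR, per the description above (hypothetical name). */
#define PMSCR_CX_SHIFT_SKETCH 3

/*
 * Hypothetical helper: returns true when the CX bit is set in a
 * PMSCR value, i.e. when context packets would be generated.
 */
static bool pmscr_cx_enabled(uint64_t pmscr)
{
	return ((pmscr >> PMSCR_CX_SHIFT_SKETCH) & 1) != 0;
}
```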

Alternatively, rather than adding a new config, I am wondering whether
we could simply use two flags in perf's decoding:
'use_switch_event_for_pid' and 'use_ctx_packet_for_pid'.  The first
flag is set if the decoder detects that the tracing is userspace only;
the second flag is set when it detects that the hardware trace contains
context packets.  So if 'use_ctx_packet_for_pid' has been set, the
decoder always uses the context packet for the sample's PID; otherwise,
it falls back to checking 'use_switch_event_for_pid' and sets the
sample PID based on switch events.
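
A rough sketch of that decision order follows. Only the two flag names
come from the proposal above; the struct, the enum, and both function
names are hypothetical, and returning -1 for "no reliable source" is an
assumption of this sketch rather than existing perf behaviour.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical: where the sample PID is taken from. */
enum pid_source {
	PID_FROM_CTX_PACKET,
	PID_FROM_SWITCH_EVENT,
	PID_UNKNOWN,
};

/* The two flags proposed above, grouped in a hypothetical struct. */
struct spe_pid_flags {
	bool use_ctx_packet_for_pid;	/* trace contains context packets */
	bool use_switch_event_for_pid;	/* tracing is userspace only */
};

/* Context packets win; switch events are only the fallback. */
static enum pid_source spe_pick_pid_source(const struct spe_pid_flags *f)
{
	if (f->use_ctx_packet_for_pid)
		return PID_FROM_CTX_PACKET;
	if (f->use_switch_event_for_pid)
		return PID_FROM_SWITCH_EVENT;
	return PID_UNKNOWN;
}

/* Pick the PID for a sample; -1 means no reliable source (assumption). */
static pid_t spe_sample_pid(const struct spe_pid_flags *f,
			    pid_t ctx_pid, pid_t switch_pid)
{
	switch (spe_pick_pid_source(f)) {
	case PID_FROM_CTX_PACKET:
		return ctx_pid;
	case PID_FROM_SWITCH_EVENT:
		return switch_pid;
	default:
		return -1;
	}
}
```

The point of keeping the source selection separate from the PID lookup
is that the flags only need to be evaluated once per session, not once
per sample.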

If you have any other ideas, please feel free to bring them up.

Thanks,
Leo
