lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <0ca8c095-9165-bcd1-26bc-3e5d2191affb@arm.com>
Date:   Fri, 16 Apr 2021 13:16:36 +0300
From:   James Clark <james.clark@....com>
To:     Mathieu Poirier <mathieu.poirier@...aro.org>
Cc:     coresight@...ts.linaro.org, al.grant@....com,
        branislav.rankov@....com, denik@...omium.org,
        suzuki.poulose@....com, Mike Leach <mike.leach@...aro.org>,
        Leo Yan <leo.yan@...aro.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        John Garry <john.garry@...wei.com>,
        Will Deacon <will@...nel.org>,
        linux-arm-kernel@...ts.infradead.org,
        linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] perf cs-etm: Set time on synthesised samples to
 preserve ordering



On 15/04/2021 22:54, Mathieu Poirier wrote:
> On Wed, Apr 14, 2021 at 05:39:19PM +0300, James Clark wrote:
>> The following attribute is set when synthesising samples in
>> timed decoding mode:
>>
>>     attr.sample_type |= PERF_SAMPLE_TIME;
>>
>> This results in new samples that appear to have timestamps but
>> because we don't assign any timestamps to the samples, when the
>> resulting inject file is opened again, the synthesised samples
>> will be on the wrong side of the MMAP or COMM events.
>>
> 
> I understand the problem.  Once again an issue caused by CS and the kernel
> having a different view of time. 
> 
>> For example this results in the samples being associated with
>> the perf binary, rather than the target of the record:
>>
>>     perf record -e cs_etm/@..._etr0/u top
>>     perf inject -i perf.data -o perf.inject --itrace=i100il
>>     perf report -i perf.inject
>>
>> Where 'Command' == perf should show as 'top':
>>
>>     # Overhead  Command  Source Shared Object  Source Symbol           Target Symbol           Basic Block Cycles
>>     # ........  .......  ....................  ......................  ......................  ..................
>>     #
>>         31.08%  perf     [unknown]             [.] 0x000000000040c3f8  [.] 0x000000000040c3e8  -
>>
>> If the perf.data file is opened directly with perf, without the
>> inject step, then this already works correctly because the
>> events are synthesised after the COMM and MMAP events and
>> no second sorting happens. Re-sorting only happens when opening
>> the perf.inject file for the second time so timestamps are
>> needed.
>>
>> Using the timestamp from the AUX record mirrors the current
>> behaviour when opening directly with perf, because the events
>> are generated on the call to cs_etm__process_queues().
>>
>> Signed-off-by: James Clark <james.clark@....com>
>> Co-developed-by: Al Grant <al.grant@....com>
>> Signed-off-by: Al Grant <al.grant@....com>
> 
> Suzuki is correct, your name has to appear after Al's.
> 
>> ---
>>  tools/perf/util/cs-etm.c | 10 ++++++++--
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
>> index c25da2ffa8f3..d0fa9dce47f1 100644
>> --- a/tools/perf/util/cs-etm.c
>> +++ b/tools/perf/util/cs-etm.c
>> @@ -54,6 +54,7 @@ struct cs_etm_auxtrace {
>>  	u8 sample_instructions;
>>  
>>  	int num_cpu;
>> +	u64 latest_kernel_timestamp;
>>  	u32 auxtrace_type;
>>  	u64 branches_sample_type;
>>  	u64 branches_id;
>> @@ -1192,6 +1193,8 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
>>  	event->sample.header.misc = cs_etm__cpu_mode(etmq, addr);
>>  	event->sample.header.size = sizeof(struct perf_event_header);
>>  
>> +	if (!etm->timeless_decoding)
>> +		sample.time = etm->latest_kernel_timestamp;
>>  	sample.ip = addr;
>>  	sample.pid = tidq->pid;
>>  	sample.tid = tidq->tid;
>> @@ -1248,6 +1251,8 @@ static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq,
>>  	event->sample.header.misc = cs_etm__cpu_mode(etmq, ip);
>>  	event->sample.header.size = sizeof(struct perf_event_header);
>>  
>> +	if (!etm->timeless_decoding)
>> +		sample.time = etm->latest_kernel_timestamp;
>>  	sample.ip = ip;
>>  	sample.pid = tidq->pid;
>>  	sample.tid = tidq->tid;
>> @@ -2412,9 +2417,10 @@ static int cs_etm__process_event(struct perf_session *session,
>>  	else if (event->header.type == PERF_RECORD_SWITCH_CPU_WIDE)
>>  		return cs_etm__process_switch_cpu_wide(etm, event);
>>  
>> -	if (!etm->timeless_decoding &&
>> -	    event->header.type == PERF_RECORD_AUX)
>> +	if (!etm->timeless_decoding && event->header.type == PERF_RECORD_AUX) {
>> +		etm->latest_kernel_timestamp = sample_kernel_timestamp;
> 
> It will be fun to fix this when 8.4 comes out but for now it's the best we've
> got.
> 

Thanks for the reviews Leo, Mathieu and Suzuki. Yes I think for 8.4 we can also do something
very similar to Leo's "perf arm-spe: Bail out if the trace is later than perf event"
patch where decoding is paused until the other events with later timestamps
have been received. At the moment the CS decoding happens all at once.

I will submit a new set with the fixes and better variable name.

James

> Reviewed-by: Mathieu Poirier <mathieu.poirier@...aro.org>
> 
>>  		return cs_etm__process_queues(etm);
>> +	}
>>  
>>  	return 0;
>>  }
>> -- 
>> 2.28.0
>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ