lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 11 Mar 2022 18:42:26 +0100
From:   "Steinar H. Gunderson" <sesse@...gle.com>
To:     Adrian Hunter <adrian.hunter@...el.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] perf intel-pt: Synthesize cycle events

On Fri, Mar 11, 2022 at 11:10:30AM +0200, Adrian Hunter wrote:
>> @@ -1633,7 +1639,7 @@ static int intel_pt_synth_instruction_sample(struct intel_pt_queue *ptq)
>>  	else
>>  		sample.period = ptq->state->tot_insn_cnt - ptq->last_insn_cnt;
>>  
>> -	if (ptq->sample_ipc)
>> +	if (ptq->sample_ipc || pt->sample_cycles)
> This is not quite right.  ptq->sample_ipc is set to indicate when the
> cycle count is accurate for the current instruction.  It can be weakened
> by using "Approx IPC" which was introduced for dlfilter-show-cycles.
> Probably that approach should be followed for a "cycles" event also.

Thanks for the review!

I'm not sure if I understand this entirely. The point of the code here
is to get sample.cyc_cnt computed, even if we're not sampling IPC.
I've seen the approx IPC code, but I'm not entirely sure how it
interacts with this?

>>  		sample.cyc_cnt = ptq->ipc_cyc_cnt - ptq->last_in_cyc_cnt;
>>  	if (sample.cyc_cnt) {
>>  		sample.insn_cnt = ptq->ipc_insn_cnt - ptq->last_in_insn_cnt;
>> @@ -1643,8 +1649,30 @@ static int intel_pt_synth_instruction_sample(struct intel_pt_queue *ptq)
>>  
>>  	ptq->last_insn_cnt = ptq->state->tot_insn_cnt;
> There are variables here that are specific to the "instructions" event, so
> mixing "cycles" with "instructions" means duplicating those, however maybe
> it would be better not to allow "y" and "i" options at the same time?

Given that a decode can easily take an hour, it would be really nice to
be able to keep y and i at the same time :-) (A long-standing pet peeve
of mine in perf is that you can't show two events side-by-side;
oprofile did that back in the day, at least on annotations.)

What specifically do you mean by duplicating? That we need to calculate
them twice in a way I don't do in my current patch, or something else?
It feels like I'm missing some context here.

>> -	return intel_pt_deliver_synth_event(pt, event, &sample,
>> -					    pt->instructions_sample_type);
>> +	if (pt->sample_instructions) {
>> +		err = intel_pt_deliver_synth_event(pt, event, &sample,
>> +						   pt->instructions_sample_type);
>> +		if (err)
>> +			return err;
>> +	}
>> +
>> +	/*
>> +	 * NOTE: If not doing sampling (e.g. itrace=y0us), we will in practice
>> +	 * only see cycles being attributed to branches, since CYC packets
>> +	 * only are emitted together with other packets are emitted.
>> +	 * We should perhaps consider spreading it out over everything since
>> +	 * the last CYC packet, ie., since last time sample.cyc_cnt was nonzero.
>> +	 */
>> +	if (pt->sample_cycles && sample.cyc_cnt) {
>> +		sample.id = ptq->pt->cycles_id;
>> +		sample.stream_id = ptq->pt->cycles_id;
> A "cycles" sample needs to set the sample period to the number of cycles since the
> last "cycles" sample.

OK. If I understand you right, is this as simple as sample.period =
sample.cyc_cnt?

/* Steinar */

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ