Message-ID: <9d267395-fda8-d111-52a2-e7cdcdf7d24b@linux.intel.com>
Date: Wed, 27 Mar 2019 10:25:46 -0400
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Andi Kleen <ak@...ux.intel.com>
Cc: peterz@...radead.org, acme@...nel.org, mingo@...hat.com,
linux-kernel@...r.kernel.org, tglx@...utronix.de, jolsa@...nel.org,
eranian@...gle.com, alexander.shishkin@...ux.intel.com
Subject: Re: [PATCH V4 04/23] perf/x86/intel: Support adaptive PEBSv4
On 3/26/2019 6:24 PM, Andi Kleen wrote:
>> + for (at = base; at < top; at += cpuc->pebs_record_size) {
>> + u64 pebs_status;
>> +
>> + pebs_status = get_pebs_status(at) & cpuc->pebs_enabled;
>> + pebs_status &= mask;
>> +
>> + for_each_set_bit(bit, (unsigned long *)&pebs_status, size)
>> + counts[bit]++;
>> + }
>
> On Icelake pebs_status is always reliable, so I don't think we need
> the two pass walking.
>
We need to call perf_event_overflow() for the last record of each event,
and it's hard to tell which record is the last one for an event with a
single-pass walk.

Also, I'm not sure how much a single-pass walk would save. The
optimization would only benefit large PEBS, and even for large PEBS the
total number of records should not be huge.

I will evaluate the performance impact of a single-pass walk. If there
is a measurable improvement, I will submit a separate patch later. For
now, I think we can keep the mature two-pass walk.
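To illustrate the point, here is a toy sketch of the two-pass scheme
(hypothetical simplified types, not the actual kernel structures or the
real PEBS record layout): the first pass counts the records per event
bit, which lets the second pass know when it has reached the last record
for an event and should fire the overflow handling.

```c
#define MAX_EVENTS 8

/* Toy stand-in for a PEBS record: only the status bitmask matters here. */
struct record {
	unsigned long status;
};

/* Pass 1: count how many records belong to each event (bit). */
static void count_records(const struct record *base, int n,
			  int counts[MAX_EVENTS])
{
	for (int i = 0; i < n; i++)
		for (int bit = 0; bit < MAX_EVENTS; bit++)
			if (base[i].status & (1UL << bit))
				counts[bit]++;
}

/*
 * Pass 2: walk the records for one event; "overflow" fires only on the
 * last record, which is only known because pass 1 counted them up front.
 * Returns the number of records seen for this event.
 */
static int deliver(const struct record *base, int n, int bit, int total,
		   int *overflows)
{
	int seen = 0;

	for (int i = 0; i < n; i++) {
		if (!(base[i].status & (1UL << bit)))
			continue;
		seen++;
		if (seen == total)	/* last record for this event */
			(*overflows)++;
	}
	return seen;
}
```

With a single pass there is no cheap way to know, while standing on a
record, whether another record for the same event still follows.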
Thanks,
Kan
> -Andi
>
>> +
>> + for (bit = 0; bit < size; bit++) {
>> + if (counts[bit] == 0)
>> + continue;
>> +
>> + event = cpuc->events[bit];
>> + if (WARN_ON_ONCE(!event))
>> + continue;
>> +
>> + if (WARN_ON_ONCE(!event->attr.precise_ip))
>> + continue;
>> +
>> + __intel_pmu_pebs_event(event, iregs, base,
>> + top, bit, counts[bit],
>> + setup_pebs_adaptive_sample_data);
>> + }
>> +}