[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0272153e-f2a5-67df-9402-16d3cc118713@linux.intel.com>
Date: Tue, 26 Jan 2021 11:21:21 -0500
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: acme@...nel.org, mingo@...nel.org, linux-kernel@...r.kernel.org,
eranian@...gle.com, namhyung@...nel.org, jolsa@...hat.com,
ak@...ux.intel.com, yao.jin@...ux.intel.com
Subject: Re: [PATCH 03/12] perf/x86/intel: Add perf core PMU support for
Sapphire Rapids
On 1/26/2021 10:37 AM, Peter Zijlstra wrote:
> On Tue, Jan 19, 2021 at 12:38:22PM -0800, kan.liang@...ux.intel.com wrote:
>> @@ -1577,9 +1668,20 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
>> }
>>
>> if (format_size & PEBS_DATACFG_MEMINFO) {
>> + if (sample_type & PERF_SAMPLE_WEIGHT) {
>> + u64 weight = meminfo->latency;
>> +
>> + if (x86_pmu.flags & PMU_FL_INSTR_LATENCY)
>> + weight >>= PEBS_CACHE_LATENCY_OFFSET;
>> + data->weight = weight & PEBS_LATENCY_MASK ?:
>> intel_get_tsx_weight(meminfo->tsx_tuning);
>> + }
>> +
>> + if (sample_type & PERF_SAMPLE_WEIGHT_EXT) {
>> + data->weight_ext.val = 0;
>> + if (x86_pmu.flags & PMU_FL_INSTR_LATENCY)
>> + data->weight_ext.instr_latency = meminfo->latency & PEBS_LATENCY_MASK;
>> + }
>>
>> if (sample_type & PERF_SAMPLE_DATA_SRC)
>> data->data_src.val = get_data_src(event, meminfo->aux);
>
> Talk to me about that SAMPLE_WEIGHT stuff.... I'm not liking it.
>
> Sure you want multiple dimensions, but urgh.
>
> Also, afaict, as proposed you're wasting 80/128 bits. That is, all data
> you want to export fits in a single u64 and yet you're using two, which
> is mighty daft.
>
> Sure, pebs::lat / pebs_meminfo::latency is defined as a u64, but you
> can't tell me that that is ever actually more than 4G cycles. Even the
> TSX block latency is u32.
>
> So how about defining SAMPLE_WEIGHT_STRUCT which uses the exact same
> data as SAMPLE_WEIGHT but unions it with a struct. I'm not sure if we
> want:
>
> union sample_weight {
> u64 weight;
>
> struct {
> u32 low_dword;
> u32 high_dword;
> };
>
> /* or */
>
> struct {
> u32 low_dword;
> u16 high_word;
> u16 higher_word;
> };
> };
>
> Then have the core code enforce SAMPLE_WEIGHT ^ SAMPLE_WEIGHT_STRUCT and
> make the existing code never set the high dword.
So the kernel will only accept either SAMPLE_WEIGHT type or
SAMPLE_WEIGHT_STRUCT type. It should error out if both types are set, right?
I will check if u32 is enough for meminfo::latency on the previous
platforms.
Thanks,
Kan
Powered by blists - more mailing lists