Message-ID: <2c057288-afe9-4117-8db3-5211fb82615c@linux.intel.com>
Date: Mon, 8 Dec 2025 14:00:45 +0800
From: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo
<acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>, Jiri Olsa <jolsa@...nel.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Andi Kleen <ak@...ux.intel.com>, Eranian Stephane <eranian@...gle.com>,
Mark Rutland <mark.rutland@....com>, broonie@...nel.org,
Ravi Bangoria <ravi.bangoria@....com>, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org, Zide Chen <zide.chen@...el.com>,
Falcon Thomas <thomas.falcon@...el.com>, Dapeng Mi <dapeng1.mi@...el.com>,
Xudong Hao <xudong.hao@...el.com>, Kan Liang <kan.liang@...ux.intel.com>
Subject: Re: [Patch v5 07/19] perf: Add sampling support for SIMD registers
On 12/5/2025 7:40 PM, Peter Zijlstra wrote:
> On Wed, Dec 03, 2025 at 02:54:48PM +0800, Dapeng Mi wrote:
>
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 3e9c48fa2202..b19de038979e 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -7469,6 +7469,50 @@ perf_output_sample_regs(struct perf_output_handle *handle,
>> }
>> }
>>
>> +static void
>> +perf_output_sample_simd_regs(struct perf_output_handle *handle,
>> + struct perf_event *event,
>> + struct pt_regs *regs,
>> + u64 mask, u16 pred_mask)
>> +{
>> + u16 pred_qwords = event->attr.sample_simd_pred_reg_qwords;
>> + u16 vec_qwords = event->attr.sample_simd_vec_reg_qwords;
>> + u64 pred_bitmap = pred_mask;
>> + u64 bitmap = mask;
>> + u16 nr_vectors;
>> + u16 nr_pred;
>> + int bit;
>> + u64 val;
>> + u16 i;
>> +
>> + nr_vectors = hweight64(bitmap);
>> + nr_pred = hweight64(pred_bitmap);
>> +
>> + perf_output_put(handle, nr_vectors);
>> + perf_output_put(handle, vec_qwords);
>> + perf_output_put(handle, nr_pred);
>> + perf_output_put(handle, pred_qwords);
>> +
>> + if (nr_vectors) {
>> + for_each_set_bit(bit, (unsigned long *)&bitmap,
> This isn't right. Yes we do this all the time in the x86 code, but there
> we can assume little-endian byte order. This is core code and is also
> used on big-endian systems where this is very much broken.
Oh, yes. I overlooked the endianness issue. I'll fix it in the next version. Thanks.
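
To illustrate the problem: casting a u64 to (unsigned long *) only works where the first unsigned long at that address holds the low bits. On a 32-bit big-endian machine the first word is the high half, so the bit indices come out swapped between the two halves. Below is a minimal userspace sketch of one endian-safe alternative, which tests each bit by value instead of reinterpreting memory. It is only an illustration and not necessarily the fix for the next version; the helper name collect_set_bits() is hypothetical. In the kernel, copying the mask into a DECLARE_BITMAP() with bitmap_from_u64() before calling for_each_set_bit() would be another option.

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Userspace sketch (hypothetical helper, not part of the patch):
 * collect the indices of the set bits in a 64-bit mask, lowest bit
 * first, using only shifts and masks on the value itself. Nothing
 * here depends on how the 64-bit value is laid out in memory, so
 * the result is the same on little-endian and big-endian machines,
 * 32-bit or 64-bit.
 */
static size_t collect_set_bits(uint64_t mask, int *out)
{
	size_t n = 0;
	int bit;

	for (bit = 0; bit < 64; bit++) {
		if (mask & (UINT64_C(1) << bit))
			out[n++] = bit;
	}

	return n;
}
```

The caller provides an array of at least 64 ints; the return value is the number of indices written, which matches what hweight64() would report for the same mask.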
>
>> + sizeof(bitmap) * BITS_PER_BYTE) {
>> + for (i = 0; i < vec_qwords; i++) {
>> + val = perf_simd_reg_value(regs, bit, i, false);
>> + perf_output_put(handle, val);
>> + }
>> + }
>> + }
>> + if (nr_pred) {
>> + for_each_set_bit(bit, (unsigned long *)&pred_bitmap,
>> + sizeof(pred_bitmap) * BITS_PER_BYTE) {
>> + for (i = 0; i < pred_qwords; i++) {
>> + val = perf_simd_reg_value(regs, bit, i, true);
>> + perf_output_put(handle, val);
>> + }
>> + }
>> + }
>> +}