Message-ID: <20220315112232.GF8939@worktop.programming.kicks-ass.net>
Date: Tue, 15 Mar 2022 12:22:32 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Anshuman Khandual <anshuman.khandual@....com>
Cc: linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
acme@...nel.org, Robin Murphy <robin.murphy@....com>,
Suzuki Poulose <suzuki.poulose@....com>,
James Clark <james.clark@....com>,
Ingo Molnar <mingo@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Will Deacon <will@...nel.org>,
linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH V4 03/10] perf: Extend branch type classification
On Tue, Mar 15, 2022 at 11:05:09AM +0530, Anshuman Khandual wrote:
> branch_entry.type has now run out of space to accommodate more branch type
> classifications. This will prevent the perf branch stack implementation on
> arm64 (via BRBE) from capturing all available branch types. Extending this
> bit field, i.e. branch_entry.type [4 bits], is not an option as it would
> break the user space ABI for both little and big endian perf tools.
>
> Extend branch classification with a new field branch_entry.new_type via a
> new branch type PERF_BR_EXTEND_ABI in branch_entry.type. Perf tools that
> can decode PERF_BR_EXTEND_ABI will then parse branch_entry.new_type as
> well.
>
> branch_entry.new_type is a 4 bit field which can hold up to 16 branch types.
> The first three branch types will hold various generic page faults, followed
> by five architecture specific branch types, which can be overridden by the
> platform for specific use cases. These architecture specific branch types
> get overridden on the arm64 platform for the BRBE implementation.
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 26d8f0b5ac0d..d29280adc3c4 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -255,9 +255,22 @@ enum {
> PERF_BR_IRQ = 12, /* irq */
> PERF_BR_SERROR = 13, /* system error */
> PERF_BR_NO_TX = 14, /* not in transaction */
> + PERF_BR_EXTEND_ABI = 15, /* extend ABI */
> PERF_BR_MAX,
> };
> #define PERF_SAMPLE_BRANCH_PLM_ALL \
> (PERF_SAMPLE_BRANCH_USER|\
> PERF_SAMPLE_BRANCH_KERNEL|\
> @@ -1372,7 +1385,8 @@ struct perf_branch_entry {
> abort:1, /* transaction abort */
> cycles:16, /* cycle count to last branch */
> type:4, /* branch type */
> - reserved:40;
> + new_type:4, /* additional branch type */
> + reserved:36;
> };
Hurmpf... this will effectively give us 5 bits of space for the cost of
8, that seems... unfortunate.
Would something like:

	type:4,
	ext_type:4,
	reserved:36;

and have all software do:

	type = pbe->type | (pbe->ext_type << 4);

Then old software will only know about the old types. New software on
old kernels will add 4 0's, which is harmless, while new software on new
kernels will get 8 bits of type.
Would that work?
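
For illustration, a minimal sketch of the decode side of the suggestion above. The struct is a hypothetical stand-in for the flag word of struct perf_branch_entry (the names demo_branch_flags and demo_branch_type are made up for this example and do not exist in the UAPI header), and it assumes a compiler such as GCC or Clang that accepts 64-bit bit-fields:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical copy of the perf_branch_entry flag word with the
 * suggested layout. Illustration only, not the real UAPI header.
 */
struct demo_branch_flags {
	uint64_t mispred:1,   /* target mispredicted */
		 predicted:1, /* target predicted */
		 in_tx:1,     /* in transaction */
		 abort:1,     /* transaction abort */
		 cycles:16,   /* cycle count to last branch */
		 type:4,      /* existing low 4 bits of the branch type */
		 ext_type:4,  /* suggested high 4 bits; reads as 0 on old kernels */
		 reserved:36;
};

/* The flag word must stay 64 bits wide so the record size is unchanged. */
static_assert(sizeof(struct demo_branch_flags) == sizeof(uint64_t),
	      "flag word must remain 64 bits");

/* Combine both fields into one 8-bit branch type, as suggested above. */
static inline unsigned int demo_branch_type(const struct demo_branch_flags *pbe)
{
	return (unsigned int)pbe->type | ((unsigned int)pbe->ext_type << 4);
}
```

With ext_type left at 0, as the previously reserved bits would read on an old kernel, demo_branch_type() returns the plain 4-bit type, so existing values 0..15 keep their meaning.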