Message-ID: <20201109121237.GJ2594@hirez.programming.kicks-ass.net>
Date: Mon, 9 Nov 2020 13:12:37 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: David Laight <David.Laight@...lab.com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
Jesper Dangaard Brouer <brouer@...hat.com>,
"mingo@...nel.org" <mingo@...nel.org>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"kan.liang@...ux.intel.com" <kan.liang@...ux.intel.com>,
"acme@...nel.org" <acme@...nel.org>,
"mark.rutland@....com" <mark.rutland@....com>,
"alexander.shishkin@...ux.intel.com"
<alexander.shishkin@...ux.intel.com>,
"jolsa@...hat.com" <jolsa@...hat.com>,
"namhyung@...nel.org" <namhyung@...nel.org>,
"ak@...ux.intel.com" <ak@...ux.intel.com>,
"eranian@...gle.com" <eranian@...gle.com>
Subject: Re: [PATCH 4/6] perf: Optimize get_recursion_context()

On Sat, Oct 31, 2020 at 12:11:42PM +0000, David Laight wrote:
> The gcc 7.5.0 I have handy probably generates the best code for:
>
> unsigned char q_2(unsigned int pc)
> {
> 	unsigned char rctx = 0;
>
> 	rctx += !!(pc & (NMI_MASK));
> 	rctx += !!(pc & (NMI_MASK | HARDIRQ_MASK));
> 	rctx += !!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET));
>
> 	return rctx;
> }
>
> 0000000000000000 <q_2>:
> 0: f7 c7 00 00 f0 00 test $0xf00000,%edi # clock 0
> 6: 0f 95 c0 setne %al # clock 1
> 9: f7 c7 00 00 ff 00 test $0xff0000,%edi # clock 0
> f: 0f 95 c2 setne %dl # clock 1
> 12: 01 c2 add %eax,%edx # clock 2
> 14: 81 e7 00 01 ff 00 and $0xff0100,%edi
> 1a: 0f 95 c0 setne %al
> 1d: 01 d0 add %edx,%eax # clock 3
> 1f: c3 retq
>
> I doubt that is beatable.
>
> I've annotated the register dependency chain.
> Likely to be 3 (or maybe 4) clocks.
> The other versions are a lot worse (7 or 8) without allowing
> for 'sbb' taking 2 clocks on a lot of Intel cpus.

https://godbolt.org/z/EfnG8E

Recent GCC just doesn't want to do that. Still, using u8 makes sense, so
I've kept that.
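
For anyone who wants to try the quoted q_2() outside the kernel tree, here
is a minimal standalone sketch. The mask values are an assumption: they are
read off the immediates in the quoted disassembly (0xf00000, 0xff0000,
0xff0100) rather than taken from a header, and the u8 typedef and the
q_2_selfcheck() helper are hypothetical additions for illustration only.
Each test adds 1 when any bit of a progressively wider mask is set, so the
result is 0..3 (presumably task, softirq, hardirq, NMI, in that order).

```c
#include <assert.h>

/* Assumed values, inferred from the immediates in the quoted disassembly. */
#define NMI_MASK        0x00f00000u
#define HARDIRQ_MASK    0x000f0000u
#define SOFTIRQ_OFFSET  0x00000100u

typedef unsigned char u8;	/* stand-in for the kernel's u8 */

/* Same body as the quoted q_2(), with u8 as the accumulator type. */
static u8 q_2(unsigned int pc)
{
	u8 rctx = 0;

	/* Each wider mask also matches everything the narrower one did. */
	rctx += !!(pc & (NMI_MASK));
	rctx += !!(pc & (NMI_MASK | HARDIRQ_MASK));
	rctx += !!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET));

	return rctx;
}

/* Hypothetical helper: checks one representative value per level. */
static void q_2_selfcheck(void)
{
	assert(q_2(0x00000000u) == 0);	/* no context bits set */
	assert(q_2(0x00000001u) == 0);	/* low bits are not tested at all */
	assert(q_2(SOFTIRQ_OFFSET) == 1);	/* only the widest mask hits */
	assert(q_2(0x00010000u) == 2);	/* a HARDIRQ_MASK bit: two masks hit */
	assert(q_2(0x00100000u) == 3);	/* an NMI_MASK bit: all three hit */
}
```

Calling q_2_selfcheck() from a test main() is enough to confirm the mapping;
building with -O2 and looking at the disassembly shows what a given compiler
makes of it.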