Message-ID: <20200724144503.GD1180481@tassilo.jf.intel.com>
Date: Fri, 24 Jul 2020 07:45:03 -0700
From: Andi Kleen <ak@...ux.intel.com>
To: Ian Rogers <irogers@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel@...r.kernel.org, Stephane Eranian <eranian@...gle.com>
Subject: Re: [PATCH] perf bench: Add benchmark of find_next_bit

On Fri, Jul 24, 2020 at 12:19:59AM -0700, Ian Rogers wrote:
> for_each_set_bit, or similar functions like for_each_cpu, may be hot
> within the kernel. If many bits were set then one could imagine on
> Intel a "bt" instruction with every bit may be faster than the function
> call and word length find_next_bit logic. Add a benchmark to measure
> this.
> This benchmark on AMD rome and Intel skylakex shows "bt" is not a good
> option except for very small bitmaps.

Small bitmaps are a common case in the kernel (e.g. cpu bitmaps).
But the current code isn't that great for small bitmaps. It always looks horrific
when I look at PT traces or brstackinsn, especially since it was optimized
purely for code size at some point.
Probably would be better to have different implementations for
different sizes.
-Andi
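
For illustration only, here is a minimal userspace sketch of the idea discussed above: a per-bit test loop (roughly what a "bt" per bit would do), a word-at-a-time scan, and a dispatcher that picks a single-word fast path for small bitmaps. None of this is the kernel's find_next_bit code. All function names are hypothetical, and it assumes a GCC or Clang compiler for __builtin_ctzl.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical: test each bit in turn, similar in spirit to one "bt" per bit. */
static inline size_t next_bit_by_test(const unsigned long *map, size_t nbits,
                                      size_t start)
{
    for (size_t i = start; i < nbits; i++)
        if (map[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD)))
            return i;
    return nbits; /* no set bit found */
}

/* Hypothetical: scan a word at a time, skipping words that are all zero. */
static inline size_t next_bit_by_word(const unsigned long *map, size_t nbits,
                                      size_t start)
{
    if (start >= nbits)
        return nbits;

    size_t nwords = (nbits + BITS_PER_WORD - 1) / BITS_PER_WORD;
    size_t idx = start / BITS_PER_WORD;
    /* Mask off the bits below 'start' in the first word. */
    unsigned long word = map[idx] & (~0UL << (start % BITS_PER_WORD));

    while (word == 0) {
        if (++idx >= nwords)
            return nbits;
        word = map[idx];
    }

    size_t bit = idx * BITS_PER_WORD + (size_t)__builtin_ctzl(word);
    return bit < nbits ? bit : nbits; /* ignore bits past the end */
}

/* Hypothetical fast path: the whole bitmap fits in one word, so no loop. */
static inline size_t next_bit_small(unsigned long word, size_t nbits,
                                    size_t start)
{
    assert(nbits <= BITS_PER_WORD);
    if (start >= nbits)
        return nbits;

    word &= ~0UL << start; /* start < nbits <= BITS_PER_WORD, shift is valid */
    if (word == 0)
        return nbits;

    size_t bit = (size_t)__builtin_ctzl(word);
    return bit < nbits ? bit : nbits;
}

/* Hypothetical dispatcher: choose an implementation based on bitmap size. */
static inline size_t next_bit(const unsigned long *map, size_t nbits,
                              size_t start)
{
    if (nbits == 0)
        return 0;
    if (nbits <= BITS_PER_WORD)
        return next_bit_small(map[0], nbits, start);
    return next_bit_by_word(map, nbits, start);
}
```

Each function returns nbits when no further bit is set, so a caller can loop with "for (i = next_bit(map, n, 0); i < n; i = next_bit(map, n, i + 1))". Whether such a split actually helps would have to be measured, for example with the benchmark from the patch.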