Message-ID: <20220701193409.p4ejod7olx7ngl5m@google.com>
Date: Fri, 1 Jul 2022 12:34:09 -0700
From: Fangrui Song <maskray@...gle.com>
To: Bill Wendling <morbo@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
"Jose E. Marchesi" <jemarch@....org>,
Ruud van der Pas <ruud.vanderpas@...cle.com>,
Nick Desaulniers <ndesaulniers@...gle.com>,
Sami Tolvanen <samitolvanen@...gle.com>,
Vladimir Mezentsev <vladimir.mezentsev@...cle.com>,
clang-built-linux <llvm@...ts.linux.dev>,
LKML <linux-kernel@...r.kernel.org>, Yonghong Song <yhs@...com>,
Wenlei He <wenlei@...com>, Hongtao Yu <hoy@...com>,
Ingo Molnar <mingo@...nel.org>,
linux-toolchains <linux-toolchains@...r.kernel.org>,
elena.zannoni@...cle.com
Subject: Re: plumbers session on profiling?
On 2022-07-01, Bill Wendling wrote:
>On Fri, Jul 1, 2022 at 4:49 AM Peter Zijlstra <peterz@...radead.org> wrote:
>> On Fri, Jul 01, 2022 at 03:17:54AM -0700, Bill Wendling wrote:
>> > On Fri, Jul 1, 2022 at 2:02 AM Peter Zijlstra <peterz@...radead.org> wrote:
>> > >
>> > > On Tue, Jun 28, 2022 at 07:08:48PM +0200, Jose E. Marchesi wrote:
>> > > >
>> > > > [Added linux-toolchains@...r in CC]
>> > > >
>> > > > It would be interesting to have some discussion in the Toolchains track
>> > > > on building the kernel with PGO/FDO. I have seen a rise in interest in
>> > > > the topic in several companies, but it would make very little sense if
>> > > > no kernel hacker is interested in participating... anybody?
>> > >
>> > > I know there's been a lot of work in this area, but none of it seems to
>> > > have trickled down to be easy enough for me to use it.
>> >
>> > We use an instrumented kernel to collect the data we need. It gives us
>> > the best payoff, because the profiling data is more fine-grained and
>> > accurate. (PGO does much more than make inlining decisions.)
>> >
>> > If I recall correctly, you previously suggested using sampling data.
>> > (Correct?) Is there a document or article that outlines that process?
>>
>> IIRC Google has LBR sample driven PGO somewhere as well. ISTR that being
>> the whole motivation for that gruesome Zen3 BRS hack.
>>
>> Google got me this: https://research.google.com/pubs/archive/45290.pdf
>>
I strongly support adding instrumentation-based PGO to the mainline
kernel, but I vaguely recall that it was NAKed by Linus (because he
thought sample-based PGO was better).
>Right. However, there's a chicken-and-egg issue with AutoFDO for the
>production kernel. We can't release a kernel that hasn't been compiled
>with PGO/FDO. We could only release it in a test environment, in which
>case we could use AutoFDO. However, the document says that AutoFDO
>only reaches ~90% of FDO. They list some reasons for this, but
>nonetheless I suspect that the delta would be too severe for us to
>release the kernel.
>
>As for LBR, that will work with Intel/AMD, but I thought that LBR
>doesn't exist for Arm processors (my knowledge could be out of date on
>this).
Some folks are trying to use the Embedded Trace Macrocell (ETM) for this.
I am not at all familiar with it, but retrieving profiles does not seem
easy. The effort required seems even higher than for
instrumentation-based PGO.

Instrumentation-based PGO has the nice property that it works on every
architecture the compiler supports and does not rely on hardware
support. In addition, it collects indirect call targets and string
operation sizes, which are very difficult or impossible to obtain with
sample-based PGO.
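
To illustrate the indirect call target point, here is a minimal sketch in
C of what value profiling at a single call site conceptually records. The
struct layout and the names (call_site_profile, profile_target,
call_profiled, hottest_target) are made up for illustration and are not
the compiler runtime's actual data structures. It covers only indirect
call targets, not string operation sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-call-site record; real runtimes use their own layout. */
#define MAX_TARGETS 4

typedef int (*handler_fn)(int);

struct call_site_profile {
    handler_fn target[MAX_TARGETS]; /* callees observed at this site */
    uint64_t   count[MAX_TARGETS];  /* number of calls per callee */
    uint64_t   other;               /* calls to callees that did not fit */
};

/* Record one observed callee. Slots fill in order, so the first empty
 * slot always comes after every slot already in use. */
static void profile_target(struct call_site_profile *p, handler_fn fn)
{
    assert(p != NULL && fn != NULL);
    for (size_t i = 0; i < MAX_TARGETS; i++) {
        if (p->count[i] == 0) {          /* empty slot: claim it */
            p->target[i] = fn;
            p->count[i] = 1;
            return;
        }
        if (p->target[i] == fn) {        /* known callee: bump its count */
            p->count[i]++;
            return;
        }
    }
    p->other++;                          /* table full */
}

/* What an instrumented indirect call conceptually looks like. */
static int call_profiled(struct call_site_profile *p, handler_fn fn, int arg)
{
    profile_target(p, fn);
    return fn(arg);
}

/* Most frequent callee seen at this site, or NULL if none was recorded. */
static handler_fn hottest_target(const struct call_site_profile *p)
{
    handler_fn best = NULL;
    uint64_t best_count = 0;
    for (size_t i = 0; i < MAX_TARGETS; i++) {
        if (p->count[i] > best_count) {
            best_count = p->count[i];
            best = p->target[i];
        }
    }
    return best;
}

/* Two sample callees for trying the sketch out. */
static int add_one(int x) { return x + 1; }
static int twice(int x)   { return 2 * x; }
```

Routing calls through call_profiled() with a zero-initialized
call_site_profile yields a per-site callee histogram. A compiler can use
such a histogram to turn a dominant callee into a guarded direct call. A
sampling profiler generally needs branch records such as LBR to recover
the same information.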
>What would make PGO (sample-based or instrumented) easy enough for you
>to use? What're the key elements missing?
>
>-bw
>