[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGG=3QVqbuhocgx0sJmqEkTkHo0Q=K5+7+2X6ONvcX7cVZc1+w@mail.gmail.com>
Date: Mon, 14 Jun 2021 04:41:01 -0700
From: Bill Wendling <morbo@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Kees Cook <keescook@...gle.com>, Jonathan Corbet <corbet@....net>,
Masahiro Yamada <masahiroy@...nel.org>,
Linux Doc Mailing List <linux-doc@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Linux Kbuild mailing list <linux-kbuild@...r.kernel.org>,
clang-built-linux <clang-built-linux@...glegroups.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Nathan Chancellor <natechancellor@...il.com>,
Nick Desaulniers <ndesaulniers@...gle.com>,
Sami Tolvanen <samitolvanen@...gle.com>,
Fangrui Song <maskray@...gle.com>,
"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
andreyknvl@...il.com, dvyukov@...gle.com, elver@...gle.com,
johannes.berg@...el.com, oberpar@...ux.vnet.ibm.com,
linux-toolchains@...r.kernel.org
Subject: Re: [PATCH v9] pgo: add clang's Profile Guided Optimization infrastructure
On Mon, Jun 14, 2021 at 3:45 AM Peter Zijlstra <peterz@...radead.org> wrote:
> On Mon, Jun 14, 2021 at 02:39:41AM -0700, Bill Wendling wrote:
> > On Mon, Jun 14, 2021 at 2:01 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> > > Because having GCOV, KCOV and PGO all do essentially the same thing
> > > differently, makes heaps of sense?
> > >
> > It does when you're dealing with one toolchain without access to another.
>
> Here's a sekrit, don't tell anyone, but you can get a free copy of GCC
> right here:
>
> https://gcc.gnu.org/
>
> We also have this linux-toolchains list (Cc'ed now) that contains folks
> from both sides.
>
Your sarcasm is not useful.
> > > I understand that the compilers actually generates radically different
> > > instrumentation for the various cases, but essentially they're all
> > > collecting (function/branch) arcs.
> > >
> > That's true, but there's no one format for profiling data that's
> > usable between all compilers. I'm not even sure there's a good way to
> > translate between, say, gcov and llvm's format. To make matters more
> > complicated, each compiler's format is tightly coupled to a specific
> > version of that compiler. And depending on *how* the data is collected
> > (e.g. sampling or instrumentation), it may not give us the full
> > benefit of FDO/PGO.
>
> I'm thinking that something simple like:
>
> struct arc {
> u64 from;
> u64 to;
> u64 nr;
> u64 cntrs[0];
> };
>
> goes a very long way. Stick a header on that says how large cntrs[] is,
> and some other data (like load offset and whatnot) and you should be
> good.
>
> Combine that with the executable image (say /proc/kcore) to recover
> what's @from (call, jmp or conditional branch) and I'm thinking one
> ought to be able to construct lots of useful data.
>
> I've also been led to believe that the KCOV data format is not in fact
> dependent on which toolchain is used.
>
> > > I'm thinking it might be about time to build _one_ infrastructure for
> > > that and define a kernel arc format and call it a day.
> > >
> > That may be nice, but it's a rather large request.
>
> Given GCOV just died, perhaps you can look at what KCOV does and see if
> that can be extended to do as you want. KCOV is actively used and
> we actually tripped over all the fun little noinstr bugs at the time.
Powered by blists - more mailing lists