[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200625082433.GC117543@hirez.programming.kicks-ass.net>
Date: Thu, 25 Jun 2020 10:24:33 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Nick Desaulniers <ndesaulniers@...gle.com>
Cc: Sami Tolvanen <samitolvanen@...gle.com>,
Masahiro Yamada <masahiroy@...nel.org>,
Will Deacon <will@...nel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"Paul E. McKenney" <paulmck@...nel.org>,
Kees Cook <keescook@...omium.org>,
clang-built-linux <clang-built-linux@...glegroups.com>,
Kernel Hardening <kernel-hardening@...ts.openwall.com>,
linux-arch <linux-arch@...r.kernel.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
Linux Kbuild mailing list <linux-kbuild@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>, linux-pci@...r.kernel.org,
"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>
Subject: Re: [PATCH 00/22] add support for Clang LTO
On Thu, Jun 25, 2020 at 10:03:13AM +0200, Peter Zijlstra wrote:
> On Wed, Jun 24, 2020 at 02:31:36PM -0700, Nick Desaulniers wrote:
> > On Wed, Jun 24, 2020 at 2:15 PM Peter Zijlstra <peterz@...radead.org> wrote:
> > >
> > > On Wed, Jun 24, 2020 at 01:31:38PM -0700, Sami Tolvanen wrote:
> > > > This patch series adds support for building x86_64 and arm64 kernels
> > > > with Clang's Link Time Optimization (LTO).
> > > >
> > > > In addition to performance, the primary motivation for LTO is to allow
> > > > Clang's Control-Flow Integrity (CFI) to be used in the kernel. Google's
> > > > Pixel devices have shipped with LTO+CFI kernels since 2018.
> > > >
> > > > Most of the patches are build system changes for handling LLVM bitcode,
> > > > which Clang produces with LTO instead of ELF object files, postponing
> > > > ELF processing until a later stage, and ensuring initcall ordering.
> > > >
> > > > Note that first objtool patch in the series is already in linux-next,
> > > > but as it's needed with LTO, I'm including it also here to make testing
> > > > easier.
> > >
> > > I'm very sad that yet again, memory ordering isn't addressed. LTO vastly
> > > increases the range of the optimizer to wreck things.
> >
> > Hi Peter, could you expand on the issue for the folks on the thread?
> > I'm happy to try to hack something up in LLVM if we check that X does
> > or does not happen; maybe we can even come up with some concrete test
> > cases that can be added to LLVM's codebase?
>
> I'm sure Will will respond, but the basic issue is the trainwreck C11
> made of dependent loads.
>
> Anyway, here's a link to the last time this came up:
>
> https://lore.kernel.org/linux-arm-kernel/20171116174830.GX3624@linux.vnet.ibm.com/
Another good read:
https://lore.kernel.org/lkml/20150520005510.GA23559@linux.vnet.ibm.com/
and having (partially) re-read that, I now worry intensily about things
like latch_tree_find(), cyc2ns_read_begin, __ktime_get_fast_ns().
It looks like kernel/time/sched_clock.c uses raw_read_seqcount() which
deviates from the above patterns by, for some reason, using a primitive
that includes an extra smp_rmb().
And this is just the few things I could remember off the top of my head,
who knows what else is out there.
Powered by blists - more mailing lists