Message-ID: <CA+55aFxhUDTW_Pa9-+jmXhNDDTy5nrkiSaswxRTHh7u+j8gnOA@mail.gmail.com>
Date:   Mon, 5 Feb 2018 09:19:05 -0800
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     David Laight <David.Laight@...lab.com>
Cc:     Linus Walleij <linus.walleij@...aro.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Ingo Molnar <mingo@...nel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        "linux-gpio@...r.kernel.org" <linux-gpio@...r.kernel.org>
Subject: Re: [GIT PULL] pin control bulk changes for v4.16

On Mon, Feb 5, 2018 at 8:55 AM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> End result: opening a file - whether it exists or not - doesn't
> actually go down to the filesystem at all when things are cached. My
> kernel profiles also show that very clearly, there's absolutely no
> filesystem component to the build at all (but there is a noticeable
> VFS component to it, and __d_lookup_rcu is generally one of the
> hottest kernel functions along with the system call entry/exit code).

Note that when I do kernel profiles of kernel builds, I do it mostly
for the "everything is already built" case, so the real footprint for
much of my profiles is actually mostly "make" doing millions of
open/stat calls.

Because once you actually build things, the kernel is almost not
noticeable any more. It's all gcc. And people always say that it's
optimizations that are expensive, but from the profiling I've done of
user space, a _lot_ of time is spent in just parsing and reading the
data.

In fact, having just re-done this, the top function in profiling is
"_cpp_lex_token()" at 3.4% of overall time for my test kernel build.

That matches my experience from sparse: the real overhead in a
compiler is just the stupid lexing/parsing. Cache misses galore, and
there's nothing really smart you can do about it.
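
To make the "just work" part concrete, here is a minimal sketch of a
tokenizer loop in C. It is not gcc's _cpp_lex_token() or anything from
sparse; the names (tok_kind, next_token, count_tokens) are hypothetical
and the token classes are heavily simplified. The point it illustrates
is only that every byte of every included header has to pass through a
loop of this shape before any later stage can run.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, heavily simplified token classes. */
enum tok_kind { TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_PUNCT };

struct token {
    enum tok_kind kind;
    const char *start;  /* points into the source buffer */
    size_t len;
};

/* Scan one token starting at *cursor and advance *cursor past it. */
static struct token next_token(const char **cursor)
{
    const char *p = *cursor;
    struct token t;

    /* Skip whitespace: touched once per byte, nothing to skip ahead on. */
    while (isspace((unsigned char)*p))
        p++;

    t.start = p;
    if (*p == '\0') {
        t.kind = TOK_EOF;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        t.kind = TOK_IDENT;
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
    } else if (isdigit((unsigned char)*p)) {
        t.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p))
            p++;
    } else {
        /* Everything else is treated as a one-character punctuator. */
        t.kind = TOK_PUNCT;
        p++;
    }

    t.len = (size_t)(p - t.start);
    *cursor = p;
    return t;
}

/* Count the tokens in a NUL-terminated source string. */
static size_t count_tokens(const char *src)
{
    size_t n = 0;

    assert(src != NULL);
    while (next_token(&src).kind != TOK_EOF)
        n++;
    return n;
}
```

Calling count_tokens() on a whole preprocessed translation unit shows
the cost scaling directly with the amount of included text, whether or
not the declarations in it are ever used.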

Once you get to optimization, you can do smart things like hash the
SSA representation to do CSE cheaply, or use other clever data structures. But
lexing and parsing the tree is reading text and allocating and
generating the internal representation, and it's just "work". Lots of
it.
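
As an illustration of the "hash the SSA representation" idea, the
following is a minimal sketch of hash-based common subexpression
elimination (CSE) in C. It assumes a toy IR invented for this example:
the types and functions (struct insn, struct cse_table, cse_lookup) are
hypothetical and are not taken from gcc or sparse. Each (op, lhs, rhs)
triple is hashed, and a second request for the same triple returns the
existing value number instead of creating a new one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy IR: every instruction defines exactly one SSA value. */
enum op { OP_CONST, OP_ADD, OP_MUL };

struct insn {
    enum op op;
    int lhs;  /* value number of left operand, or the literal for OP_CONST */
    int rhs;  /* value number of right operand, 0 for OP_CONST */
};

#define TABLE_SIZE 64  /* power of two; fixed size keeps the sketch short */

struct cse_table {
    struct insn key[TABLE_SIZE];
    int value[TABLE_SIZE];  /* value number per slot, -1 means empty */
    int next_value;
};

static void cse_init(struct cse_table *t)
{
    for (size_t i = 0; i < TABLE_SIZE; i++)
        t->value[i] = -1;
    t->next_value = 0;
}

/* FNV-1a style mixing over the three fields. */
static uint32_t insn_hash(const struct insn *in)
{
    uint32_t h = 2166136261u;

    h = (h ^ (uint32_t)in->op) * 16777619u;
    h = (h ^ (uint32_t)in->lhs) * 16777619u;
    h = (h ^ (uint32_t)in->rhs) * 16777619u;
    return h;
}

/*
 * Return the value number for (op, lhs, rhs). An identical expression
 * seen earlier yields the same number, which is the CSE step.
 */
static int cse_lookup(struct cse_table *t, enum op op, int lhs, int rhs)
{
    struct insn in = { op, lhs, rhs };
    uint32_t slot;

    /* Canonicalize commutative ops so a+b and b+a hash identically. */
    if ((op == OP_ADD || op == OP_MUL) && in.lhs > in.rhs) {
        int tmp = in.lhs;
        in.lhs = in.rhs;
        in.rhs = tmp;
    }

    slot = insn_hash(&in) & (TABLE_SIZE - 1);
    for (;;) {
        if (t->value[slot] < 0) {
            /* Keep one slot free so linear probing always terminates. */
            assert(t->next_value < TABLE_SIZE - 1);
            t->key[slot] = in;
            t->value[slot] = t->next_value++;
            return t->value[slot];
        }
        if (t->key[slot].op == in.op &&
            t->key[slot].lhs == in.lhs &&
            t->key[slot].rhs == in.rhs)
            return t->value[slot];
        slot = (slot + 1) & (TABLE_SIZE - 1);
    }
}
```

Each lookup is expected constant time, which is the kind of shortcut
that has no real counterpart in the front end: the lexer still has to
read all of the text.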

And that is why trying to avoid unnecessary header includes matters so
much. Because the front-end really does matter for compiler
performance.
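
One common way to avoid an include in C is a forward declaration of a
struct type. The sketch below is a single file standing in for what
would normally be separate headers; the names (struct device, struct
request, request_has_device, request_device_id) are hypothetical and
chosen only for illustration. Code that merely stores or passes a
pointer needs just the one-line declaration, and only code that
dereferences the pointer needs the full definition.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Forward declaration: the compiler learns only that the type exists.
 * A lightweight header can use this instead of including the header
 * that defines struct device.
 */
struct device;

/* Holding a pointer to an incomplete type is allowed. */
struct request {
    struct device *dev;
    int sector;
};

/* Needs no knowledge of the layout of struct device. */
static int request_has_device(const struct request *rq)
{
    return rq->dev != NULL;
}

/*
 * Full definition. In a real tree this would live in its own header,
 * included only by the files that look inside the struct.
 */
struct device {
    int id;
    char name[16];
};

/* Dereferences the pointer, so the full definition must be visible. */
static int request_device_id(const struct request *rq)
{
    assert(rq->dev != NULL);
    return rq->dev->id;
}
```

Every file that only uses struct request this way skips lexing and
parsing the device header and everything that header pulls in.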

(And it's at least partly why C++ is such a pain to compile, and why
C++ people want pre-compiled headers etc. You can't just do a forward
declaration of a struct type, and you get header inclusion from hell
when you have "clever" classes and inheritance etc. C++ build times
tend to be really nasty as a result).

                  Linus
