linux-kernel - Re: [ANNOUNCE] "Fast Kernel Headers" Tree -v2

Message-ID: <YegErRbP+cT42oOC@gmail.com>
Date:   Wed, 19 Jan 2022 13:31:41 +0100
From:   Ingo Molnar <mingo@...nel.org>
To:     Arnd Bergmann <arnd@...db.de>
Cc:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-arch <linux-arch@...r.kernel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Nathan Chancellor <nathan@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ard Biesheuvel <ardb@...nel.org>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Jonathan Corbet <corbet@....net>
Subject: Re: [ANNOUNCE] "Fast Kernel Headers" Tree -v2


* Arnd Bergmann <arnd@...db.de> wrote:

> > I tried to avoid as many low level headers as possible from the main 
> > types headers - and the get_order() functionality also brings in bitops 
> > definitions, which I'm still hoping to be able to reduce from its 
> > current ~95% utilization in a distro kernel ...
> 
> Agreed, I think reducing bitops.h and atomic.h usage is fairly important, 
> I think these are even bigger on arm64 than on x86.

So what I'm using for 'header complexity metrics' is rather simple: passing
-P -H to the preprocessor, so that comments are stripped and no line-markers
are generated (-H additionally lists the headers used), and then counting
the lines of output.

Line-markers should *probably* remain, because the real build is generating 
them too - but I wanted to gain a crude & easily available metric to 
measure 'first-pass parsing complexity'. I think that's where most of the
header bloat is concentrated: later passes don't really get any of the 
unused header definitions passed along. (But maybe this is an invalid 
assumption, because compiler warnings do get generated by later passes, and 
they are generated for mostly-unused header inlines too.)

If we include comments & line-markers then the bloat goes up by another 
~2x:

 kepler:~/mingo.tip.git> ./st include/linux/sched.h 
  #include <linux/sched.h>                | LOC:  2,186 | headers:  118
 kepler:~/mingo.tip.git> ./st include/linux/sched.h 
  #include <linux/sched.h>                | LOC:  4,092 | headers:    0
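
For readers who want to reproduce this kind of measurement, here is a minimal
sketch in Python. It is not the ./st script used above, whose contents are not
shown in this mail. The function names, the choice to ignore blank lines, and
the extra_flags parameter are all assumptions made for illustration. It relies
only on the documented GCC behaviour that -E -P writes preprocessed text
without line-markers to stdout, and that -H writes one line per included
header to stderr, prefixed by dots that indicate nesting depth.

```python
import re
import subprocess

# GCC's -H prints one line per included header on stderr:
# one or more dots (nesting depth), a space, then the header path.
_HEADER_LINE = re.compile(r"^\.+ (\S.*)$")


def count_loc(preprocessed: str) -> int:
    """Count lines of preprocessor output.

    Assumption: blank lines are ignored; the original script may differ.
    """
    return sum(1 for line in preprocessed.splitlines() if line.strip())


def count_headers(h_output: str) -> int:
    """Count distinct header paths in the stderr text produced by -H."""
    paths = set()
    for line in h_output.splitlines():
        match = _HEADER_LINE.match(line)
        if match:
            paths.add(match.group(1))
    return len(paths)


def measure(source: str, cc: str = "gcc", extra_flags: tuple = ()) -> tuple:
    """Preprocess `source` with -E -P -H and return (loc, headers).

    `extra_flags` is a hypothetical hook for the include paths and defines
    that a real kernel build passes; they are not listed in this mail.
    """
    cmd = [cc, "-E", "-P", "-H", *extra_flags, source]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return count_loc(result.stdout), count_headers(result.stderr)
```

Dropping -P from the command (and adding -C to keep comments) would give the
comments-and-line-markers variant of the count. A kernel header will only
preprocess cleanly when the build's own -I and -D flags are supplied through
extra_flags.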


> > We could add <linux/page_api.h> as well, as a standardized header. We 
> > already have page_types.h and get_order() is a page types API.
> 
> More generally speaking, do you have a plan for how to document which 
> header to include for getting a particular symbol that is provided by a 
> header we don't want to include directly? I think iwyu has a particular 
> notation for it, but when I looked at using that in 2020 I decided it 
> wouldn't scale to the size of the kernel. I did my own shell script with 
> a long list of regex patterns, but I'm not convinced about that approach 
> either.

Yeah, I don't think we should do much that hurts general usability of 
headers: each symbol has a primary "natural" header, and .c code and other 
headers are encouraged but not strictly required to include that.

Thanks,

	Ingo