lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAK8P3a2yVXw9gf8-BNvX_rzectNoiy0MqGKvBcXydiUSrc_fCA@mail.gmail.com>
Date:   Fri, 19 Nov 2021 16:57:46 +0100
From:   Arnd Bergmann <arnd@...db.de>
To:     "Alejandro Colomar (man-pages)" <alx.manpages@...il.com>
Cc:     Arnd Bergmann <arnd@...db.de>, LKML <linux-kernel@...r.kernel.org>,
        Ajit Khaparde <ajit.khaparde@...adcom.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        Borislav Petkov <bp@...e.de>,
        Corey Minyard <cminyard@...sta.com>, Chris Mason <clm@...com>,
        Christian Brauner <christian.brauner@...ntu.com>,
        David Sterba <dsterba@...e.com>,
        Jani Nikula <jani.nikula@...ux.intel.com>,
        Jason Wang <jasowang@...hat.com>,
        Jitendra Bhivare <jitendra.bhivare@...adcom.com>,
        John Hubbard <jhubbard@...dia.com>,
        "John S . Gruber" <JohnSGruber@...il.com>,
        Jonathan Cameron <Jonathan.Cameron@...wei.com>,
        Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>,
        Josef Bacik <josef@...icpanda.com>,
        Kees Cook <keescook@...omium.org>,
        Ketan Mukadam <ketan.mukadam@...adcom.com>,
        Len Brown <lenb@...nel.org>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Miguel Ojeda <ojeda@...nel.org>,
        Mike Rapoport <rppt@...ux.ibm.com>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Rasmus Villemoes <linux@...musvillemoes.dk>,
        Rodrigo Vivi <rodrigo.vivi@...el.com>,
        Russell King <linux@...linux.org.uk>,
        Somnath Kotur <somnath.kotur@...adcom.com>,
        Sriharsha Basavapatna <sriharsha.basavapatna@...adcom.com>,
        Subbu Seetharaman <subbu.seetharaman@...adcom.com>,
        Intel Graphics <intel-gfx@...ts.freedesktop.org>,
        ACPI Devel Maling List <linux-acpi@...r.kernel.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        linux-btrfs <linux-btrfs@...r.kernel.org>,
        linux-scsi <linux-scsi@...r.kernel.org>,
        Networking <netdev@...r.kernel.org>,
        "open list:DRM DRIVER FOR QEMU'S CIRRUS DEVICE" 
        <virtualization@...ts.linux-foundation.org>
Subject: Re: [PATCH 00/17] Add memberof(), split some headers, and slightly
 simplify code

On Fri, Nov 19, 2021 at 4:06 PM Alejandro Colomar (man-pages)
<alx.manpages@...il.com> wrote:
> On 11/19/21 15:47, Arnd Bergmann wrote:
> > On Fri, Nov 19, 2021 at 12:36 PM Alejandro Colomar
>
> Yes, I would like to untangle the dependencies.
>
> The main reason I started doing this splitting
> is because I wouldn't be able to include
> <linux/stddef.h> in some headers,
> because it pulled too much stuff that broke unrelated things.
>
> So that's why I started from there.
>
> I for example would like to get NULL in memberof()
> without puling anything else,
> so <linux/NULL.h> makes sense for that.
>
> It's clear that every .c wants NULL,
> but it's not so clear that every .c wants
> everything that <linux/stddef.h> pulls indirectly.

>From what I can tell, linux/stddef.h is tiny, I don't think it's really
worth optimizing this part. I have spent some time last year
trying to untangle some of the more interesting headers, but ended
up not completing this as there are some really hard problems
once you start getting to the interesting bits.

The approach I tried was roughly:

- For each header in the kernel, create a preprocessed version
  that includes all the indirect includes, from that start a set
  of lookup tables that record which header is eventually included
  by which ones, and the size of each preprocessed header in
  bytes

- For a given kernel configuration (e.g. defconfig or allmodconfig)
  that I'm most interested in, look at which files are built, and what
  the direct includes are in the source files.

- Sort the headers by the product of the number of direct includes
  and the preprocessed size: the largest ones are those that are
  worth looking at first.

- use graphviz to visualize the directed graph showing the includes
  between the top 100 headers in that list. You get something like
  I had in [1], or the version afterwards at [2].

- split out unneeded indirect includes from the headers in the center
  of that graph, typically by splitting out struct definitions.

- repeat.

The main problem with this approach is that as soon as you start
actually reducing the unneeded indirect includes, you end up with
countless .c files that no longer build because they are missing a
direct include for something that was always included somewhere
deep underneath, so I needed a second set of scripts to add
direct includes to every .c file.

On the plus side, I did see something on the order of a 30%
compile speed improvement with clang, which is insane
given that this only removed dead definitions.

> But I'll note that linux/fs.h, linux/sched.h, linux/mm.h are
> interesting headers for further splitting.
>
>
> BTW, I also have a longstanding doubt about
> how header files are organized in the kernel,
> and which headers can and cannot be included
> from which other files.
>
> For example I see that files in samples or scripts or tools,
> that redefine many things such as offsetof() or ARRAY_SIZE(),
> and I don't know if there's a good reason for that,
> or if I should simply remove all that stuff and
> include <linux/offsetof.h> everywhere I see offsetof() being used.

The main issue here is that user space code should not
include anything outside of include/uapi/ and arch/*/include/uapi/

offsetof() is defined in include/linux/stddef.h, so this is by
definition not accessible here. It appears that there is also
an include/uapi/linux/stddef.h that is really strange because
it includes linux/compiler_types.h, which in turn is outside
of uapi/. This should probably be fixed.

      Arnd

[1] https://drive.google.com/file/d/14IKifYDadg2W5fMsefxr4373jizo9bLl/view?usp=sharing
[2] https://drive.google.com/file/d/1pWQcv3_ZXGqZB8ogV-JOfoV-WJN2UNnd/view?usp=sharing

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ