linux-kernel - Re: Follow-up on Linux-kernel code accessibility

Message-ID: <f3f69a07-e742-47ba-92c6-0e4d303b1a20@lucifer.local>
Date: Tue, 6 Jan 2026 18:05:28 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Gabriele Paoloni <gpaoloni@...hat.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Kate Stewart <kstewart@...uxfoundation.org>,
        Chuck Wolber <chuckwolber@...il.com>,
        "Julia.Lawall@...ia.fr" <Julia.Lawall@...ia.fr>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Mark Rutland <mark.rutland@....com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Shuah Khan <skhan@...uxfoundation.org>, linux-kernel@...r.kernel.org
Subject: Re: Follow-up on Linux-kernel code accessibility

Sorry been on leave!

Sorry to fork the thread but going to take me a while to catch up with it.

On Thu, Dec 18, 2025 at 11:49:21AM -0800, Paul E. McKenney wrote:
> Hello!
>
> Just following up on some Linux Plumbers Conference discussions on the
> accessibility of Linux-kernel code to people ranging from novices to
> the developers and maintainers of the code in question.  I am adding
> Lorenzo on CC not because he was involved with these discussions (at
> least as far as I know), but rather because I am using some of his work
> in my follow-up analysis.
>
> The Linux kernel's mm system weighs in at about 200KLoC, and Lorenzo
> wrote a book on its design that weighs in at about 1300 pages, or
> about 150 LoC/page.  This suggests that the Linux-kernel scheduler,
> which weighs in at about 70KLoC and has similar heuristics/workload
> challenges as does mm, would require a 430-page textbook to provide a
> similar level of design detail.  By this methodology, RCU would require
> "only" 190 pages, presumably substituting its unfamiliarity for sched's
> and mm's deeply heuristic and workload-dependent nature.

Well - keep in mind my book explicitly and intentionally excludes a _great
deal_ of topics (simply because I didn't have the time or capacity to cover
more), and even when exploring the code, I made liberal use of 'X is out of
scope' here to a. make it readable without being distracted constantly, and
b. again for time/capacity reasons.

And of course, I focused on only one architecture for anything
arch-specific (x86-64) with similar excuses^Wreasoning so there's that as a
multiplier too.

Overall I suspect what I cover is really only 10% of mm, however well (or
otherwise) I covered it.

So I'd x10 the LoC there ;)
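
The back-of-envelope arithmetic in this exchange can be sketched as below.
This is only an illustration: the function names and the `coverage`
parameter are made up here, the inputs are the rounded figures quoted above,
and reading "x10" as "the book covers roughly 10% of mm" is an assumption.
With these rounded inputs the naive estimate does not exactly reproduce the
430-page figure quoted above, which presumably rests on slightly different
numbers.

```python
def loc_per_page(loc: int, pages: int, coverage: float = 1.0) -> float:
    """Lines of code described per page, assuming the book covers
    only the fraction `coverage` of the code base."""
    return loc * coverage / pages


def pages_needed(loc: int, density: float) -> float:
    """Pages needed to describe `loc` lines at `density` LoC per page."""
    return loc / density


# Rounded figures taken from the quoted message.
MM_LOC = 200_000
MM_BOOK_PAGES = 1_300
SCHED_LOC = 70_000

# Naive density: treat the book as covering all of mm.
naive = loc_per_page(MM_LOC, MM_BOOK_PAGES)

# Adjusted density: assume the book covers only about 10% of mm.
adjusted = loc_per_page(MM_LOC, MM_BOOK_PAGES, coverage=0.10)

print(f"naive:    {naive:.0f} LoC/page, "
      f"sched needs about {pages_needed(SCHED_LOC, naive):.0f} pages")
print(f"adjusted: {adjusted:.0f} LoC/page, "
      f"sched needs about {pages_needed(SCHED_LOC, adjusted):.0f} pages")
```

Under the 10% assumption the density drops by a factor of ten, so any page
estimate derived from it grows by the same factor.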

>
> Sadly, this data does not support the hypothesis that we can create
> comments that will provide understanding to people taking random dives
> into the Linux kernel's source code.  In contrast to code that is closely
> associated with a specific type of mechanical device, Linux-kernel
> code requires the reader to possess a great deal of abstract and global
> conceptual/workload information.
>
> This is not to say that the Linux kernel's internal documentation
> (including its comments) cannot or should not be improved.
> They clearly should.  It instead means that a necessary part of any
> instant-understanding methodology for the Linux kernel is active
> software assistance, for example, Anthropic's Claude LLM or IBM's (rather
> older but less readily accessible) Analysis and Renovation Catalyst (ARC).
> I am not denigrating other options, but rather restricting myself to
> tools with which I have personal experience.

In my view AI is useful in the hands of an expert who can determine when it
tells the truth or not.

So you have a catch-22 there that's unresolvable by such tooling in my
opinion, and developers relying on that from the start are unlikely to
exercise the right mental muscles.

I think there's definitely a place for AI, but I feel like this is not
it. And I think we'd do people a disservice by suggesting it.

A big idea in my book is to get people familiar with the concepts and the
code presented together so they can end up reading the code and
understanding it on the basis of the book having tied the two together and
shown 'hey it's not so bad you can extract meaning from this!'

That way, they can take the inevitably out of date contents and update to
the latest kernel with the skills developed (and of course many of the
concepts will remain valid).

>
> And one reason for continued but reasonable emphasis on internal
> documentation, including comments, is that the aforementioned tools
> ingest that documentation.  ;-)
>
> Thoughts?
>
> And in the meantime, happy holidays for those celebrating them!
>
> 							Thanx, Paul

Cheers, Lorenzo