linux-kernel - Re: Follow-up on Linux-kernel code accessibility


Message-ID: <636d1798-3b37-293a-51b2-55d2ecc6d2d@inria.fr>
Date: Fri, 19 Dec 2025 07:51:47 +0100 (CET)
From: Julia Lawall <julia.lawall@...ia.fr>
To: "Paul E. McKenney" <paulmck@...nel.org>
cc: Gabriele Paoloni <gpaoloni@...hat.com>, 
    Steven Rostedt <rostedt@...dmis.org>, 
    Kate Stewart <kstewart@...uxfoundation.org>, 
    Chuck Wolber <chuckwolber@...il.com>, 
    "Julia.Lawall@...ia.fr" <Julia.Lawall@...ia.fr>, 
    Dmitry Vyukov <dvyukov@...gle.com>, Mark Rutland <mark.rutland@....com>, 
    Thomas Gleixner <tglx@...utronix.de>, 
    Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, 
    Shuah Khan <skhan@...uxfoundation.org>, linux-kernel@...r.kernel.org
Subject: Re: Follow-up on Linux-kernel code accessibility



On Thu, 18 Dec 2025, Paul E. McKenney wrote:

> Hello!
>
> Just following up on some Linux Plumbers Conference discussions on the
> accessibility of Linux-kernel code to people ranging from novices to
> the developers and maintainers of the code in question.  I am adding
> Lorenzo on CC not because he was involved with these discussions (at
> least as far as I know), but rather because I am using some of his work
> in my follow-up analysis.
>
> The Linux kernel's mm system weighs in at about 200KLoC, and Lorenzo
> wrote a book on its design that weighs in at about 1300 pages, or
> about 150 LoC/page.  This suggests that the Linux-kernel scheduler,
> which weighs in at about 70KLoC and has similar heuristics/workload
> challenges as does mm, would require a 430-page textbook to provide a
> similar level of design detail.  By this methodology, RCU would require
> "only" 190 pages, presumably substituting its unfamiliarity for sched's
> and mm's deeply heuristic and workload-dependent nature.
>
> Sadly, this data does not support the hypothesis that we can create
> comments that will provide understanding to people taking random dives
> into the Linux kernel's source code.  In contrast to code that is closely
> associated with a specific type of mechanical device, Linux-kernel
> code requires the reader to possess a great deal of abstract and global
> conceptual/workload information.
>
> This is not to say that the Linux kernel's internal documentation
> (including its comments) cannot or should not be improved.
> It clearly should be.  This instead means that a necessary part of any
> instant-understanding methodology for the Linux kernel is active
> software assistance, for example, Anthropic's Claude LLM or IBM's (rather
> older but less readily accessible) Analysis and Renovation Catalyst (ARC).
> I am not denigrating other options, but rather restricting myself to
> tools with which I have personal experience.
>
> And one reason for continued but reasonable emphasis on internal
> documentation, including comments, is that the aforementioned tools
> ingest that documentation.  ;-)

Maybe we're not looking for an instant-understanding methodology.  Rather,
we want a machine-checkable way to document the invariants that exist in
the head of the developer and, for some bounded amount of time, in the head
of the person who has tried to reconstruct them.

There are different levels of specifications that one can write.  In Japan,
I mentioned how, for the enable-disable ftrace function, I enumerated all of
the permutations of the if tests, resulting in hundreds of lines of
specifications.  After two failed attempts, the third attempt yielded
both valid specifications and some insight into what the function was
doing.  This insight could potentially be used to make some higher-level
specifications that would be even more concise than the current
English-language ones.  Maybe the low-level ones could be generated
automatically in many cases, or regenerated automatically from some hints.
But the low-level ones may be needed to make the bridge between the code
and the high-level specification.
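
As a rough illustration of the two levels (a toy sketch only, not the real
ftrace code and not the specification language I used): the function
toy_set_enabled, its three boolean conditions, and the result strings below
are all hypothetical.  The table plays the role of the low-level
specification, with one entry per permutation of the if tests.  The
predicate plays the role of a more concise high-level specification, and
the exhaustive check is the bridge between the two.

```python
from itertools import product


def toy_set_enabled(registered, enabled, want_enable):
    # Hypothetical stand-in for an enable/disable routine; NOT the real ftrace code.
    # Returns (new_enabled, result).
    if not registered:
        return enabled, "error"  # state is left untouched
    if want_enable:
        if enabled:
            return True, "noop"
        return True, "changed"
    if enabled:
        return False, "changed"
    return False, "noop"


# Low-level specification: one entry per permutation of the if tests.
# Key: (registered, enabled, want_enable) -> (new_enabled, result)
LOW_LEVEL_SPEC = {
    (False, False, False): (False, "error"),
    (False, False, True): (False, "error"),
    (False, True, False): (True, "error"),
    (False, True, True): (True, "error"),
    (True, False, False): (False, "noop"),
    (True, False, True): (True, "changed"),
    (True, True, False): (False, "changed"),
    (True, True, True): (True, "noop"),
}


def high_level_spec(registered, enabled, want_enable, new_enabled, result):
    # Concise specification, written after the table made the pattern visible.
    if not registered:
        return new_enabled == enabled and result == "error"
    expected = "noop" if enabled == want_enable else "changed"
    return new_enabled == want_enable and result == expected


def check_all():
    # Exhaustively compare the code, the low-level table, and the high-level predicate.
    cases = list(product([False, True], repeat=3))
    for case in cases:
        if toy_set_enabled(*case) != LOW_LEVEL_SPEC[case]:
            return False
        if not high_level_spec(*case, *LOW_LEVEL_SPEC[case]):
            return False
    return len(cases) == len(LOW_LEVEL_SPEC)
```

With three conditions the table has only eight rows.  A real function with
many more tests is where the table grows to hundreds of lines, and where
generating it automatically would help.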

julia


>
> Thoughts?
>
> And in the meantime, happy holidays for those celebrating them!
>
> 							Thanx, Paul
>