

Message-ID: <90d56d30-232d-4930-ad9f-5aebade7cdf2@paulmck-laptop>
Date: Thu, 18 Dec 2025 11:49:21 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Gabriele Paoloni <gpaoloni@...hat.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Kate Stewart <kstewart@...uxfoundation.org>,
	Chuck Wolber <chuckwolber@...il.com>,
	"Julia.Lawall@...ia.fr" <Julia.Lawall@...ia.fr>,
	Dmitry Vyukov <dvyukov@...gle.com>,
	Mark Rutland <mark.rutland@....com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
	Shuah Khan <skhan@...uxfoundation.org>
Cc: linux-kernel@...r.kernel.org
Subject: Follow-up on Linux-kernel code accessibility

Hello!

Just following up on some Linux Plumbers Conference discussions on the
accessibility of Linux-kernel code to people ranging from novices to
the developers and maintainers of the code in question.  I am adding
Lorenzo on CC not because he was involved with these discussions (at
least as far as I know), but rather because I am using some of his work
in my follow-up analysis.

The Linux kernel's mm system weighs in at about 200KLoC, and Lorenzo
wrote a book on its design that weighs in at about 1300 pages, or
about 150 LoC/page.  This suggests that the Linux-kernel scheduler,
which weighs in at about 70KLoC and faces heuristic and workload
challenges similar to mm's, would require a 430-page textbook to provide a
similar level of design detail.  By this methodology, RCU would require
"only" 190 pages, presumably substituting its unfamiliarity for sched's
and mm's deeply heuristic and workload-dependent nature.
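
For anyone wanting to replay the arithmetic, here is a minimal sketch
that uses only the rounded figures quoted above (200KLoC and 1300 pages
for mm, 70KLoC for sched, 190 pages for RCU).  The function and variable
names are illustrative only.  Because the inputs are rounded, the sched
result lands in the mid-400s rather than exactly on 430, which was
presumably computed from unrounded line counts.

```python
# Back-of-the-envelope page estimates from the rounded figures in the text.
# All names below are illustrative; only the numbers come from the message.

def loc_per_page(total_loc: float, pages: float) -> float:
    """Documentation density: lines of code covered per textbook page."""
    return total_loc / pages

def pages_for(total_loc: float, density: float) -> float:
    """Pages needed to document total_loc at the given LoC/page density."""
    return total_loc / density

MM_LOC = 200_000     # "about 200KLoC"
MM_PAGES = 1_300     # "about 1300 pages"
SCHED_LOC = 70_000   # "about 70KLoC"
RCU_PAGES = 190      # page estimate quoted for RCU

density = loc_per_page(MM_LOC, MM_PAGES)
sched_pages = pages_for(SCHED_LOC, density)

# The text gives no RCU line count, so this only back-calculates the
# size implied by the quoted 190 pages at the mm density.
rcu_implied_loc = RCU_PAGES * density

print(f"mm density:      {density:.0f} LoC/page")
print(f"sched estimate:  {sched_pages:.0f} pages")
print(f"RCU implied LoC: {rcu_implied_loc:.0f}")
```

Using the rounded 150 LoC/page instead of the computed density moves
the sched estimate up by about a dozen pages, so these figures are
good to roughly one significant digit at best.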

Sadly, this data does not support the hypothesis that we can create
comments that will provide understanding to people taking random dives
into the Linux kernel's source code.  In contrast to code that is closely
associated with a specific type of mechanical device, Linux-kernel
code requires the reader to possess a great deal of abstract and global
conceptual/workload information.

This is not to say that the Linux kernel's internal documentation
(including its comments) cannot or should not be improved.
It clearly should be.  It instead means that a necessary part of any
instant-understanding methodology for the Linux kernel is active
software assistance, for example, Anthropic's Claude LLM or IBM's (rather
older but less readily accessible) Analysis and Renovation Catalyst (ARC).
I am not denigrating other options, but rather restricting myself to
tools with which I have personal experience.

And one reason for continued but reasonable emphasis on internal
documentation, including comments, is that the aforementioned tools
ingest that documentation.  ;-)

Thoughts?

And in the meantime, happy holidays for those celebrating them!

							Thanx, Paul