linux-kernel - Re: [PATCH v4 00/15] Fast kernel headers: split linux/mm.h

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f959c4c4-8118-436c-83fd-d299689f753a@redhat.com>
Date: Tue, 12 Mar 2024 20:26:08 +0100
From: David Hildenbrand <david@...hat.com>
To: Max Kellermann <max.kellermann@...os.com>
Cc: akpm@...ux-foundation.org, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org, willy@...radead.org, sfr@...b.auug.org.au
Subject: Re: [PATCH v4 00/15] Fast kernel headers: split linux/mm.h

On 12.03.24 19:11, Max Kellermann wrote:
> On Tue, Mar 12, 2024 at 5:33 PM David Hildenbrand <david@...hat.com> wrote:
>> Just curious: why? Usually build time, do you have some numbers?
> 
> (This has been discussed so often and I thought having smaller header
> dependencies was already established as general consensus among kernel
> devs, and everybody agreed that mm.h and kernel.h are a big mess that
> needs cleanup.)

It's always a good idea to include at least some sentences about 
history+motivation in the cover letter.

> 
> Why: Correctness, less namespace pollution, and the least important
> aspect is reduced build times. However, the gains by this patch set
> are very small; each optimization gains very little.
> 
> Some time ago, I posted a much larger patch set with a few numbers:
> https://lore.kernel.org/lkml/20240131145008.1345531-1-max.kellermann@ionos.com/
> - the speedup was measurable, but not amazing, because even after that
> patch set, everybody still includes everything, and much more cleanup
> is needed to make a bigger difference. Once the big knots like
> kernel.h and mm.h are broken up, we will have better results. And as I
> said: build times are nice, but the lesser advantage.

Okay, so "Fast kernel headers:" is a bit misleading :P

I'm asking these stupid questions because I remember some header splitup 
(from Christoph?) that really did make a significant build-time difference.

Reading all that, I would have written something like this in the cover 
letter:

"There is an agreement that we want to split up some of the big kernel 
headers, such as kernel.h and mm.h. While not delivering immediate 
build-time gains, it's one step into the desired direction. Besides 
build-time, smaller headers promise less namespace pollution and 
correctness."

(although I don't immediately understand what correctness means in this 
context)

> 
> Anyway, this first patch set was so extremely large that nobody was
> able to review it. So this patch contains just the parts that deal
> with mm.h; later, when this patch set is merged, I can continue with
> other headers. (I already have a branch that splits kernel.h and I'll
> submit it eventually, after the mm.h cleanup.)

I'd have continued with

"Some of these changes were part of a previous, bigger patchset [1]. I'm 
now sending out smaller, reviewable pieces. Splitting up mm.h is the 
first step."

[1] 
https://lore.kernel.org/lkml/20240131145008.1345531-1-max.kellermann@ionos.com/


> 
>> I'm not against splitting out stuff. But one function per header is a
>> bit excessive IMHO.
> 
> One function per header is certainly not my goal and I agree it's
> excessive; but folio_next() in its own header made sense, just in this
> special case, because it allowed removing the mm.h dependency from
> bio.h. Removing this dependency was the goal, and folio_next.h was the
> solution for this particular problem.

Okay, I see. I still wonder if merging some of the new folio headers in 
"folio.h" wouldn't have achieved something close to that. Sure, they 
bring in some other dependencies, but hopefully not mm.h

Anyhow, what you are describing here would also have been reasonable to 
describe somewhere in few words. IOW, which approach did you chose to 
decide when a new header was reasonable.

> 
>> Ideally, we'd have some MM guideline that we'll be
>> able to follow in the future. So we don't need "personal taste".
> 
> Agree. But lacking such a guideline, all I can do is make suggestions
> and submit patches for review, trying to follow what seemed to be
> consensus in previous cleanups and what had previously been merged.

Possibly it makes sense to have some rough guideline. The problem I see 
is that once your stuff is merged, it will start getting messed up again 
over time. But maybe that's just the way it works.

> 
>> For example, if I were to write a folio_prev(), should I put it in
>> include/linux/mm/folio_prev.h ? Likely we'd put it into folio_next.h,
>> but then the name doesn't make any sense.
> 
> True. But since no folio_prev() function exists, the only name that
> made sense for this header was folio_next.h. If folio_prev() gets
> added, I'd say put it in that header, but rename it to, let's say,
> "folio_iterator.h". But right now, with just this one function (and
> nothing else like it that could be added here), I decided to suggest
> naming it "folio_next.h". If you think "folio_iterator.h" or something
> else is better, I'll rename and resubmit; names don't matter so much
> to me; the general idea of cleaning up the headers is what's
> important.
> 
>> The point I am trying to make: if there was a single folio_ops.h, it
>> would be clearer where it would go.
> 
> Maybe, but IMO this wouldn't be a good place to add folio_next(),
> because folio_next() doesn't "operate" on a folio, but on a folio
> pointer (or "iterator"). Again, names don't matter to me -
> ideas/concepts matter. I'm only explaining why I decided to submit my
> patches that way, but I'll change them any way you people want.

Thanks for the details.

-- 
Cheers,

David / dhildenb