linux-kernel - Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?

Message-ID: <261D65D8-7273-4884-BD01-2BF8331F4034@fb.com>
Date:   Thu, 16 Sep 2021 20:38:13 +0000
From:   Chris Mason <clm@...com>
To:     James Bottomley <James.Bottomley@...senPartnership.com>
CC:     Theodore Ts'o <tytso@....edu>,
        Johannes Weiner <hannes@...xchg.org>,
        "Kent Overstreet" <kent.overstreet@...il.com>,
        Matthew Wilcox <willy@...radead.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "Darrick J. Wong" <djwong@...nel.org>,
        "Christoph Hellwig" <hch@...radead.org>,
        David Howells <dhowells@...hat.com>,
        "ksummit@...ts.linux.dev" <ksummit@...ts.linux.dev>
Subject: Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers
 Summit topic?



> On Sep 16, 2021, at 1:11 PM, James Bottomley <James.Bottomley@...senPartnership.com> wrote:
> 
> On Thu, 2021-09-16 at 16:46 +0000, Chris Mason wrote:
>> 
>> With folios, we don't have general consensus on:
>> 
>> * Which problems are being solved?  Kent's writeup makes it pretty
>> clear filesystems and memory management developers have diverging
>> opinions on this.  Our process in general is to put this into patch
>> 0.  It mostly works, but there's an intermediate step between patch 0
>> and the full lwn article that would be really nice to have.
> 
> I agree here ... but problem definition is supposed to be the job of
> the submitter and fully laid out in the cover letter.
> 
>> * Who is responsible for accepting the design, and which acks must be
>> obtained before it goes upstream?  Our process here is pretty similar
>> to waiting for answers to messages in bottles.  We consistently leave
>> it implicit and poorly defined.
> 
> My answer to this would be the same list of people who'd be responsible
> for ack'ing the patches.  However, we're always very reluctant to ack
> designs in case people don't like the look of the code when it appears
> and don't want to be bound by the ack on the design.  I think we can
> get around this by making it clear that design acks are equivalent to
> "This sounds OK but I won't know for definite until I see the code"
> 
>> * What work is left before it can go upstream?  Our process could be
>> effectively modeled by postit notes on one person's monitor, which
>> they may or may not share with the group.  Also, since we don't have
>> agreement on which acks are required, there's no way to have any
>> certainty about what work is left.  It leaves authors feeling
>> derailed when discussion shifts and reviewers feeling frustrated and
>> ignored.
> 
> Actually, I don't see who should ack being an unknown.  The MAINTAINERS
> file covers most of the kernel and a set of scripts will tell you based
> on your code who the maintainers are ... that would seem to be the
> definitive ack list.

One risk with this thread is over-pivoting on folios.  It’s a great example exactly because Willy is so well established.  If the definitive ack list is easy, how do we consistently seem to mess it up?

Part of the problem is that we just leave it unsaid.  Andrew has a list in his head of acks he’s waiting for, and Willy has a slightly different list, and Linus again has a slightly different list.  

> 
> I think the problem is the ack list for features covering large areas
> is large and the problems come when the ackers don't agree ... some
> like it, some don't.  The only deadlock breaking mechanism we have for
> this is either Linus yelling at everyone or something happening to get
> everyone into alignment (like an MM summit meeting).  Our current model
> seems to be every acker has a foot on the brake, which means a single
> nack can derail the process.  It gets even worse if you get a couple of
> nacks each requesting mutually conflicting things.

Agreed.  Mailing lists make it really hard to figure out when these conflicts are resolved, which is why I love using Google Docs for that part.

> 
> We also have this other problem of subsystems not being entirely
> collaborative.  If one subsystem really likes it and another doesn't,
> there's a fear in the maintainers of simply being overridden by the
> pull request going through the liking subsystem's tree.  This could be
> seen as a deadlock breaking mechanism, but fear of this happening
> drives overreactions.

I do agree, but I think this part we actually get right more often than not.  It’s one of those places where you usually see Linus using his powers for good.

> 
> We could definitely do a clear definition of who is allowed to nack and
> when can that be overridden.
> 
>> * How do we divide up the long term future direction into individual
>> steps that we can merge?  This also goes back to consensus on the
>> design.  We can't decide which parts are going to get layered in
>> future merge windows until we know if we're building a car or a
>> banana stand.
> 
> This is usual for all large patches, though, and the author gets to
> design this.

For example: patches tripping over unrelated but useful cleanups that don’t actually have to happen first but end up as requirements for inclusion.  The examples matter less than having a way to document agreement on requirements for inclusion.

> 
>> * What tests will we use to validate it all?  Work this spread out is
>> too big for one developer to test alone.  We need ways for people
>> to sign up and agree on which tests/benchmarks provide meaningful
>> results.
> 
> In most large patches I've worked on, the maintainers raise worry about
> various areas (usually performance) and the author gets to design tests
> to validate or invalidate the concern ... which can become very open
> ended if the concern is vague.
> 
>> The end result of all of this is that missing a merge window isn't
>> just about a time delay.  You add N months of total uncertainty,
>> where every new email could result in having to start over from
>> scratch.  Willy's do-whatever-the-fuck-you-want-I'm-going-on-vacation 
>> email is probably the least surprising part of the whole thread.
>> 
>> Internally, we tend to use a simple shared document to nail all of
>> this down.  A two page google doc for folios could probably have
>> avoided a lot of pain here, especially if we’re able to agree on
>> stakeholders.
> 
> You mean like a cover letter?  Or do you mean a living document that
> the ackers could comment on and amend?

A living document with a single source of truth on key design points, work remaining, and stakeholders who are responsible for ack/nack decisions.  Basically, if you don’t have edit permissions on the document, you’re not one of the people who can say no.

If you do have edit permissions, you’re expected to be on board with the overall goal and help work through the design/validation/code/etc until you’re ready to ack it, or until it’s clear the whole thing isn’t going to work.  If you feel you need to have edit permissions, you’ve got a defined set of people to talk with about it.

It can’t completely replace the mailing lists, but it can take a lot of the archeology out of understanding a given patch series and figuring out if it’s actually ready to go.

-chris