Message-ID: <20090416220839.GA30920@elte.hu>
Date: Fri, 17 Apr 2009 00:08:39 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Theodore Tso <tytso@....edu>, "H. Peter Anvin" <hpa@...or.com>,
Linux Kernel Developers List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] Documentation: Add "how to write a good patch summary"
to SubmittingPatches
* Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Thu, 16 Apr 2009, Theodore Tso wrote:
>
> > On Thu, Apr 16, 2009 at 10:12:55PM +0200, Ingo Molnar wrote:
> > > as a bug triager i can, within 1 minute, sort all the commits by
> > > risk:
> > >
> > > Low risk cleanups:
> > > ...
> > > Runtime crash fixes:
> > > ...
> > > Robustness enhancements:
> > > ...
> > > Low-risk features:
> > > ...
> > > High-risk features:
> > > ...
> >
> > Sure, but if that's the goal, maybe instead we should have some
> > keywords that we tag onto one-line summary, i.e.
> >
> > ext4 <LR,cleanup>:
>
> Hell no.
I find those artificial tags pretty ugly too.
> The fact is, those "low risk cleanups" break things.
>
> People who think that you can assess the risk of a commit
> before-hand and then rely on it are clueless morons.
That's why it's _hard_ to write good impact lines - it takes quite a
bit of effort to assess the _expected_ impact of commits reliably
and not look like a complete fool a few days, weeks or months down
the road.
Those mistakes are also _useful_ for that exact reason: they tell us
the exact pattern of mis-judged impact and act as a feedback cycle.
We learned to be a _lot_ more careful about certain areas of code by
looking at the impact lines of commits that turned out to be broken.
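To make that concrete, here is a rough sketch of the kind of triage
this enables - assuming commits carry an 'Impact:' line somewhere in
the body. The script and the revision range are made up for
illustration, not something i'm shipping:

  #!/usr/bin/env python3
  # Bucket commits in a revision range by their "Impact:" line.
  # Illustration only - assumes the impact-line convention discussed
  # in this thread; the revision range below is made up.
  import collections
  import subprocess

  def impact_buckets(rev_range):
      # %h = short hash, %B = raw commit message, %x00 = NUL separator
      log = subprocess.run(
          ["git", "log", "--format=%h %B%x00", rev_range],
          capture_output=True, text=True, check=True).stdout
      buckets = collections.defaultdict(list)
      for entry in filter(None, log.split("\0")):
          sha, _, body = entry.strip().partition(" ")
          for line in body.splitlines():
              if line.strip().lower().startswith("impact:"):
                  buckets[line.split(":", 1)[1].strip()].append(sha)
      return buckets

  if __name__ == "__main__":
      for impact, shas in sorted(impact_buckets("v2.6.29..HEAD").items()):
          print("%4d  %s" % (len(shas), impact))

Nothing fancy - the point is that a fixed, greppable field makes this
kind of sorting mechanical instead of a manual read-through.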
And if the impact cannot be assessed reliably by looking at the
patch? Then i ask contributors to split up the patch into smaller
steps.
And the thing is, commit logs themselves - as you can see in the
18 specific examples i analyzed above - can be _far more_ ambiguous
about the true impact of a change - and you are fooling yourself if
you don't admit to this very basic, simple, daily fact of Linux
kernel commit logs.
Also, natural language commit logs tend not to be straightforward
about impact, because there's a basic inner (sub-conscious) drive in
most developers to play down the impact of some really embarrassing
brown paper bag bug, or to not think too hard about the risks of a
new feature.
Impact lines _remove_ this fear and the associated guilt factor.
They make the production of commit logs _more positive_, because
there's an unavoidable hard rule to admit to crap and mistakes in a
neutral, unemotional way. And if everyone does it consistently, it
looks a lot less embarrassing.
The basic problem is that natural languages are one big babble
machine chock-full of inner paradoxes and contradictions: they are
too vaguely defined, ambiguous, emotion-laden and over-verbose -
fertile ground for whitewashing and obscuring information - or for
just burying information in white noise.
A good commit log will tell us a nice story and gently, gradually
drive us along the pathway of the developer's thought process. But
in the overwhelming majority of cases it will not take us past the
more embarrassing bits: how stupid a bug it fixes, how severe that
bug is, or how risky a new feature is.
_LOOK_ at the 18 commit logs i spent an hour analyzing. That is our
reality - those are the top-notch commits we have - out of the best
of the best 5%.
_ADMIT_ that this basic equation is not going to change
significantly. There are small steps of progress, but our commit
logs sucked 5 and 10 years ago too, and they sucked for very
fundamental reasons.
The ext4 logs were of exceptional quality - and we even saw one
clear 'brown paper bag' bug admitted there, frankly and openly. But
that is the exception, and still, the impact lines i added _clearly
improved the end result_.
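For reference, an impact line is nothing more than a single fixed
field near the top of the changelog; schematically (the placeholders
are mine):

  <subsystem>: <one-line summary>

  Impact: <one short, factual statement of what can break or improve>

  <the usual free-form changelog: root cause, analysis, testing>

One short, factual, greppable statement of the expected effect -
that is all it is.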
The impact line forces honesty without actually accusing people of
unintentionally trying to mislead others. It also prevents people
from sub-consciously _fooling themselves_.
Will there be mistakes? Sure - and managing mistakes is the _point_
of risk analysis, so why would we want to claim that the risk
assessment is perfect? There are mistakes everywhere in the kernel,
and the only way to tackle them is to have a clear idea about them.
Human mistakes fundamentally affect the quality-mapping system too -
and analyzing that is an important part of quality analysis.
Can they be relied on? They can be relied on the same way any
written words that accompany code can be relied on: it depends on
how much i trust the person who wrote them, and it depends on the
actual track record of that code. So it can be 99% trust, or a very
low level of trust.
In the end, only reading the code will tell for sure. Sometimes not
even that.
Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/