lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160312161804.GA12688@gmail.com>
Date:	Sat, 12 Mar 2016 17:18:04 +0100
From:	Ingo Molnar <mingo@...nel.org>
To:	Toshi Kani <toshi.kani@....com>
Cc:	"bp@...e.de" <bp@...e.de>, "hpa@...or.com" <hpa@...or.com>,
	"tglx@...utronix.de" <tglx@...utronix.de>,
	"mcgrof@...e.com" <mcgrof@...e.com>,
	"jgross@...e.com" <jgross@...e.com>,
	"paul.gortmaker@...driver.com" <paul.gortmaker@...driver.com>,
	"x86@...nel.org" <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] x86/mtrr: Refactor PAT initialization code


* Toshi Kani <toshi.kani@....com> wrote:

> On Fri, 2016-03-11 at 09:13 +0000, Ingo Molnar wrote:
> > * Ingo Molnar <mingo@...nel.org> wrote:
> > 
> > > 
> > > * Toshi Kani <toshi.kani@....com> wrote:
> > > 
> > > > MTRR manages PAT initialization as it implements a rendezvous
> > > > handler that initializes PAT as part of MTRR initialization.
> > > > 
> > > > When CPU does not support MTRR, ex. qemu32 virtual CPU, MTRR
> > > > simply skips PAT init, which causes PAT left enabled without
> > > > initialization. [...]
> > > 
> > > What practical effects does this have to the user? Does the kernel
> > > crash?
> > 
> > Btw., I find this omission _highly_ annoying: describing what negative
> > effects a bug _causes in practice_ is the most important part of a
> > changelog. How on earth can an experienced contributor omit such an
> > important component from a patch description?
> > 
> > Most readers of changelogs couldn't care less about technical details of
> > how the bug is fixed (of course others will read it so it's nice to have
> > too), but what symptoms a bug causes, how serious is it, whether it
> > should be backported are like super important compared to everything else
> > you wrote - and both the description and the changelogs are totally
> > silent on those topics ...
> > 
> > I've seen this in other PAT patches - please try to improve this.
> 
> My apology. I agree the importance of describing the negative effect of the
> issue. This case is complicated to describe thoroughly, but here is a
> summary.

The new changelog looks very good, thanks!

> The issue was reported as a regression caused by 'commit 9cd25aac1f44
> ("x86/mm/pat: Emulate PAT when it is disabled")'. So, the goal of this
> patchset is to fix this regression.
> https://lkml.org/lkml/2016/3/3/828

So one thing that matters more than anything else in the changelog, the title! 
Right now the title is:

  x86/mtrr: Refactor PAT initialization code

... that's a nice title for a true refactoring of the code, but this isn't really 
that, the purpose of this fix is to fix a bad Xorg crash for Qemu users.

The principle you need to remember is that readers of your changelogs will be 
_very happy_ about 'negative' phrases like:

  bad bug
  Xorg crash
  boot failure
  kernel crash
  NULL dereference

I.e. the 'best' title for a bug fix is to characterize it in the most negative 
truthful fashion in the changelog. It sounds a bit counterintuitive but it's true. 

So in this case the best changelog title would be something like:

  x86/pat: Fix Xorg crashes in Qemu sessions

People will absolutely _love_ such titles, because:

  - users who are trying to find mysterious Xorg failures can grep for it and 
    might find it before it hits a stable kernel they are using

  - maintainers (like me) are able to see it at a glance that this fix should go 
    to Linus more urgently than other fixes. (and definitely more urgently than 
    feature patches.)

  - stable kernel maintainers and distro backporters can see it immediately at a 
    glance that they really want this fix.

So by being intentionally and maximally negative in the title, you are being very 
helpful to your fellow developers and users!

Now consider the original title:

  x86/mtrr: Refactor PAT initialization code

99% of people will glance over such a title, which is not good. Furhermore, 
maintainers like me will get _annoyed_ at such titles, because this neutrally 
formulated title, while very polite, actively hides the important detail that 
these patches fix real negative bugs for real users.

Okay?

And please also note that in the Linux kernel no-one ever 'blames' other people 
for bugs. Bugs are part of the human condition and they happen all the time as 
long as they are not introduced by carelessness. So in the typical case you cannot 
possibly socially embarrass any good kernel developer by reporting and fixing a 
bug he introduced. The typical reaction you will get is 'oh great, one bug less to 
worry about!', so socially you can be absolutely honest and 'impolite' about the 
negative effects of bugs.

> The negative effects of the issue were two failures in Xorg on qemu32 env,
> which was triggered by the fact that its virtual CPU does not support MTRR.
> https://lkml.org/lkml/2016/3/4/775
>  #1. copy_process() failed in the check in reserve_pfn_range()
>  #2. error path in copy_process() then hit WARN_ON_ONCE in untrack_pfn().

Yeah, it's nice to quote actual crash signatures as well (in a short form) - 
because people hitting the crashes often do a google search and might find the fix 
based on such patterns.

Thanks!

	Ingo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ