[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170924091625.qgudix2viua2cpkm@gmail.com>
Date: Sun, 24 Sep 2017 11:16:25 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Jean Delvare <jdelvare@...e.de>
Cc: LKML <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] firmware: dmi_scan: Drop dmi_initialized
* Jean Delvare <jdelvare@...e.de> wrote:
> Hi Ingo,
>
> On Sat, 23 Sep 2017 12:50:31 +0200, Ingo Molnar wrote:
> > * Jean Delvare <jdelvare@...e.de> wrote:
> >
> > > I don't think it makes sense to check for a possible bad
> > > initialization order at run time on every system when it is all
> > > decided at build time.
> > >
> > > A more efficient way to make sure developers do not introduce new
> > > calls to dmi_check_system() too early in the initialization sequence
> > > is to simply document the expected call order. That way, developers
> > > have a chance to get it right immediately, without having to
> > > test-boot their kernel, wonder why it does not work, and parse the
> > > kernel logs for a warning message. And we get rid of the run-time
> > > performance penalty as a nice side effect.
> >
> > Huh? Initialization ordering requirements are very opaque,
>
> They were. Now they are very documented.
>
> > and by removing the debug check any such bugs are actively hidden. How
> > is documentation supposed to uncover such bugs once they happen?
>
> You are looking at it the wrong way around. Documentation is how they
> do not happen in the first place.
That expectation, as a general statement, is very naive and contrary to
experience: documentation is fine for one layer of defense to prevent bugs, but
_when_ they happen and a bug slips through, documentation does not help anymore,
because the dependencies in the _code_ are opaque and non-obvious ...
For example during the early SMP efforts of Linux we used to document lock
dependencies as well, but once the kernel had more than a dozen spinlocks we
periodically ran into deadlocks and the whole design became unmaintainable
quickly. So we have lockdep in addition to documentation.
> You hit this problem once, 9 years ago. You thought it would have been easier to
> debug if there was a warning, and you added it.
I did not just 'think' it would have been easier to debug, I wasted time on that
bug and a warning would have helped so I added it. That was and remains
objectively true.
While I expect most such warnings to never see any public email lists (because
once a developer triggers it it gets fixed without the bug ever getting triggered
by others), yet searching for "dmi check: not initialized yet" still finds a
couple of incidents where real or potential bugs were found by this init
dependency check, such as:
http://lists.infradead.org/pipermail/linux-arm-kernel/2014-September/289347.html
or this:
https://www.spinics.net/lists/linux-acpi/msg28698.html
... so this warning actually helped a number of kernel developers to not
waste time on the opaque dependency. This is a warning that was added due
to an _actual category of bugs_, which has been triggered subsequently as
well, so it's not a frivolous warning by any meaning.
> [...] It was one way to solve the problem but I claim it was not the best.
>
> What I expect from developers calling a function they aren't familiar
> with is to read its documentation first. That's the very reason why we
> spend time writing the documentation. They should not just call the
> function, boot and see if it works or not. Software engineering vs.
> trial and error.
This statement is breathtaking in its ignorance :-(
> > So NAK.
>
> This was FYI. I maintain this subsystem, and you did not convince me. I also
> can't see a general trend of implementing what you suggest in the rest of the
> kernel. Thankfully.
I find the arrogance displayed here breathtaking as well - the last thing we need
is for firmware interfacing kernel code to become _more_ fragile.
This was and continues to be a useful warning - but what worries me even more is
not just the removal of the warning, but the false and technically invalid
justifications under which it is removed...
For those reasons I maintain my NAK:
Nacked-by: Ingo Molnar <mingo@...nel.org>
Thanks,
Ingo
Powered by blists - more mailing lists