linux-kernel - Re: [5/6] 2.6.21-rc2: known regressions

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20070316081811.1163058f.rdunlap@xenotime.net>
Date:	Fri, 16 Mar 2007 08:18:11 -0700
From:	Randy Dunlap <rdunlap@...otime.net>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Ingo Molnar <mingo@...e.hu>,
	"Michael S. Tsirkin" <mst@...lanox.co.il>,
	Thomas Gleixner <tglx@...utronix.de>,
	Adrian Bunk <bunk@...sta.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Michal Piotrowski <michal.k.k.piotrowski@...il.com>,
	Emil Karlson <jkarlson@...hut.fi>,
	Soeren Sonnenburg <kernel@....de>, Len Brown <lenb@...nel.org>
Subject: Re: [5/6] 2.6.21-rc2: known regressions

On Tue, 6 Mar 2007 08:44:43 -0800 (PST) Linus Torvalds wrote:

> 
> This is just a coding style thing, but I thought I should really point it 
> out, because these kinds of things quite often result in nasty bugs simply 
> because the source code is so hard to read properly:
> 
> On Tue, 6 Mar 2007, Ingo Molnar wrote:
> >
> > -static void hrtimer_switch_to_hres(void)
> > +static int hrtimer_switch_to_hres(void)
> 
> Ok, so here's the quiz: does this function return "true on success, false 
> on failure", or does it return "zero on success, negative on failure"?
> 
> >  	if (base->hres_active)
> > -		return;
> > +		return 1;
> 
> Ahh, it must be "true on success", right?
> 
> >  	local_irq_save(flags);
> >  
> >  	if (tick_init_highres()) {
> >  		local_irq_restore(flags);
> > -		return;
> > +		return 0;
> 
> Ohh-oh! This is clearly a failure schenario! And indeed, 
> "tick_init_highres()" will do the "negative on failure, zero on success" 
> thing.
> 
> BUT! That means that you're testing the return value WRONG!
> 
> A function that returns a negative error value should be tested with
> 
> 	if (tick_init_highres() < 0) {
> 		local_irq_restore(flags);
> 		return 0;
> 	}
> 
> because now you *see* that it's a failure.
> 
> So here's the coding style:
> 
>  - "true on success, false on failure" should be tested by just doing the 
>    implicit test against zero (because that's how C booleans work!)
> 
>    Example:
> 
> 	if (everything_is_done())
> 		return;
> 
>    Or:
> 
> 	if (!something_worked_ok()) {
> 		printk("Aiee! Bug!\n");
> 		return;
> 	}
> 
>  - "negative error values" should preferably always be tested as such
> 
> 	if (tick_init_highres() < 0) {
> 		printk("Aieee! Couldn't init!\n");
> 		return 0;
> 	}
> 
>    or, much better, actually use a temporary variable called "err" or 
>    "error" or something, at which point "!error" is suddenly readable 
>    again:
> 
[1b]
> 	err = tick_init_highres();
> 	if (!err)
> 		return;

So this one above and [2] below lose the obvious "negative error"
information, and also prevent such functions from returning a
positive value (> 0), e.g., to indicate a successful amount of
work done (like bytes read or written).  The second version above
[1b] also does not quite agree with your statement:
  - "negative error values" should preferably always be tested as such

so in the interest of CodingStyle, can you be even clearer?


> I know this sounds stupid, but we've long since come to the point where 
> source code readability on a *local* scale is damn important, simply 
> because that's how people look at code: they may not always remember 
> whether "zero is success" or "zero is false".
> 
> In general, I would suggest:
> 
>  - ALWAYS use "negative means error". If you had done that in this case, 
>    then hrtimer_switch_to_hres() would have been a lot more readable, 
>    *and* it could actually have returned the error code that it got to the 
>    caller. In general, it's just more information when you see
> 
[2]
> 	error = some_function();
> 	if (error)
> 		return error;
> 
>    because even if it may generate basically *exactly* same code as the 
>    reversed "positive" version:
> 
> 	if (!some_version_is_true())
> 		return 0;
> 
>    it simply has more semantic information for *humans*.
> 
>    And when you do this, *test it as such*. Either use an explicit "< 0" 
>    so that you *see* that you're testing an error value, or use that 
>    "retval/error = xyzzy()" pattern that is already showing "it's more 
>    than just true/false"
> 
>  - use "true/false" only for things where it's *really* obvious that the 
>    answer is never an error, and always a "was it true"?
> 
> Yeah, even so, the true/false kind of thing may be more common (especially 
> with small helper functions that are literally *designed* to be used just 
> as a conditional), but I think in this case, you really should have done 
> it as a "returns error" function. Partly because now it was throwing away 
> an error code, partly simply because in this case, it really wasn't about 
> true/false as much as about "did something error out and keep it from 
> succeeding?".
> 
> Maybe I'm just getting anal in my old age. I at one time tried to make 
> sparse check for these things, but there was no really sane thing I could 
> come up with (way way WAAY too much manual annotation).
> 
> I might have to break down and suggest people use
> 
> 	bool somefunction(..)
> 	{
> 		if (... < 0)
> 			return false;
> 		...
> 		return true;
> 	}
> 
> just to (a) eventually have sparse check for these things but more 
> importantly (b) have people see more at a glance whether a function is 
> supposed to return "negative or success" or "true or false".
> 
> I've not generally been a huge fan of "boolean", especially in the 
> traditional C kind of sense (capital screaming letters, and really just an 
> "int" with lipstick). But with modern C, and "bool" defined as really 
> holding just 0/1 (in practice - "unsigned char"), we could actually check 
> these things (and verify with sparse that you never assign any integer 
> except for 0/1 to a boolean, and otherwise always have to use a real 
> boolean construct).
> 
> Thus endeth my overly long coding style rant.
> 
> 		Linus


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/