Message-ID: <m1k5zxgplv.fsf@ebiederm.dsl.xmission.com>
Date:	Mon, 08 Jan 2007 14:45:00 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Adrian Bunk <bunk@...sta.de>
Cc:	Linus Torvalds <torvalds@...l.org>,
	Tobias Diedrich <ranma+kernel@...edrich.de>,
	Yinghai Lu <yinghai.lu@....com>, Andrew Morton <akpm@...l.org>,
	Andi Kleen <ak@...e.de>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	mingo@...hat.com, discuss@...-64.org
Subject: Re: [PATCH 4/4] x86_64 ioapic: Improve the heuristics for when check_timer fails.

Adrian Bunk <bunk@...sta.de> writes:

> On Mon, Jan 08, 2007 at 09:11:24AM -0700, Eric W. Biederman wrote:
>> 
>> To a large extent this reverts b026872601976f666bae77b609dc490d1834bf77
>> while still keeping to the spirit of its goal, the ability to
>> make smart guesses about how the timer irq is routed when the BIOS
>> gets it wrong.
>>...
>
> That's code where every changed line has a great potential of causing a 
> different kind of breakage on someone else's computer.

Why does this piece of code give everyone the screaming heebie-jeebies?
I read it, I understand it; it is code.

This code is not a terribly sensitive delicate heuristic, and Andi has
already broken it as much as it can possibly be broken.  It's not like
the code is on an SMP fastpath full of carefully orchestrated races
that are safe because within certain limits even stale values are ok.

This code is straightforward logic: you tell the computer what to
do and it does it.  Of the things we can do, only very few are
correct, and we are seeking to enhance our ability to find correct
solutions by adding intelligent guesses.  As long as the first guess
is to trust the BIOS, the rest of this code is largely a don't-care,
as Andi proved by breaking all the rest of it.  Or else why don't I
have more testers crawling out of the woodwork, screaming for this
code to be fixed?

Plus this code can only cause one type of breakage: a failure to
work around a broken BIOS and make the IRQs work.

> Your comment therefore translates to "revert commit 
> b026872601976f666bae77b609dc490d1834bf77 for 2.6.20 and try to find a 
> better solution for 2.6.21".

If that is the practical translation I am fine with it.

Linus said he wanted to try in the 2.6.20 timeframe, although things
have probably dragged on much too long.

The heuristic tests in my patch actually do what they try to do,
unlike Andi's, where only trusting the BIOS-supplied default works.
So from where we are it is a clear benefit.  b026872601976f666bae77b609dc490d1834bf77
is broken.  I just found it easier and more evolutionary to start
with a working code base.  Not that my result is radically different
codewise from what Andi and tried.

Everything that worked in 2.6.19 should also work.  Except for the ATI
case, we still have all of the other heuristic workarounds, supplied by
quirks instead of supplied manually.  The ATI fix was to not enable
interrupts on the i8259 and then expect the ioapic to work, which we
have sensibly made a global default.

I really don't care how we do it, or in what timeframe.  But what I have
posted is the only way I can see of making it better than what we had
in 2.6.19.

I would like to see my code take the conversation to talking about
what intelligent guesses we should try when the BIOS gets it wrong,
instead of being stuck where we are with Andi's code, on how to make
the heuristic guesses we want to try actually run.

Now if someone wants to see a beautiful history, we can revert Andi's
commit, implement the code motion parts of mine as functions, implement
what it takes to make those functions safe to call multiple times, and
then call them extra times as heuristic guesses to get the hardware
correct.  There might be some sense in that, although it feels like
trying too hard to me.
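
As a rough illustration of that structure, here is a minimal userspace
sketch in C.  It is not the kernel's check_timer code: struct
timer_route, struct fake_hw, setup_timer_route(), timer_ticks() and
pick_timer_route() are all hypothetical names, and the "hardware" is a
plain struct standing in for the real ioapic and i8259.  The sketch
assumes the setup step fully replaces any earlier programming, which
is what makes it safe to call once per guess, and that the caller
puts the BIOS-supplied route first in the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: one guess at how the timer irq is routed. */
struct timer_route {
    int apic;   /* which ioapic, -1 for none */
    int pin;    /* which pin on that ioapic, -1 for none */
};

/* Hypothetical stand-in for the hardware state. */
struct fake_hw {
    struct timer_route programmed;  /* what we last set up */
    struct timer_route working;     /* the only route where ticks arrive */
};

/*
 * Safe to call multiple times: each call completely replaces the
 * previous programming instead of adding to it.
 */
static void setup_timer_route(struct fake_hw *hw, struct timer_route r)
{
    hw->programmed = r;
}

/* Stand-in for "did timer interrupts actually arrive?". */
static bool timer_ticks(const struct fake_hw *hw)
{
    return hw->programmed.apic == hw->working.apic &&
           hw->programmed.pin == hw->working.pin;
}

/*
 * Try each guess in order and stop at the first one that works.
 * guesses[0] is assumed to be what the BIOS told us.  Returns the
 * index of the working guess, or -1 if none of them work.
 */
static int pick_timer_route(struct fake_hw *hw,
                            const struct timer_route *guesses, size_t n)
{
    assert(guesses != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        setup_timer_route(hw, guesses[i]);
        if (timer_ticks(hw))
            return (int)i;
    }
    return -1;
}
```

In this sketch the only failure mode is a return of -1, that is, no
guess produced a working timer, and when the first entry works the
later guesses never run at all.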

I definitely think in the 2.6.21 timeframe it makes sense to put this code in
on i386 as well.  Otherwise the maintenance will just be crazy.

Eric