Message-ID: <m1r6xjx4b4.fsf@ebiederm.dsl.xmission.com>
Date: Sat, 07 Oct 2006 14:24:15 -0600
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Linus Torvalds <torvalds@...l.org>
Cc: Muli Ben-Yehuda <muli@...ibm.com>, Ingo Molnar <mingo@...e.hu>,
Thomas Gleixner <tglx@...utronix.de>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Rajesh Shah <rajesh.shah@...el.com>, Andi Kleen <ak@....de>,
"Protasevich, Natalie" <Natalie.Protasevich@...SYS.com>,
"Luck, Tony" <tony.luck@...el.com>, Andrew Morton <akpm@...l.org>,
Linux-Kernel <linux-kernel@...r.kernel.org>,
Badari Pulavarty <pbadari@...il.com>
Subject: Re: 2.6.19-rc1 genirq causes either boot hang or "do_IRQ: cannot handle IRQ -1"

Linus Torvalds <torvalds@...l.org> writes:
> On Sat, 7 Oct 2006, Eric W. Biederman wrote:
>>
>> I am hoping that by running the apics in a different delivery mode
>> that explicitly says just deliver this interrupt to this cpu we
>> will avoid the problem you are seeing.
>
> Note that having too strict delivery modes could be a major pain in the
> future, with things like multicore CPUs a lot more actively doing power
> management on their own, and effectively going into sleep-states with
> reasonably long latencies.

Sure.

> Especially with schedulers that are aware of things like that (and we
> _try_, at least to some degree, and people are interested in more of it),
> you can easily be in the situation that one of the cores is being fairly
> actively kept in a low-power state, and can have millisecond latencies
> (not to mention no L1 cache contents etc).
>
> So I really do think that the belief that we should force irqs to a
> particular core is fundamentally flawed.

For me this isn't about forcing an irq to a particular cpu.  It
is about not having global vector allocation, because that simply
cannot scale.

Being able to allocate a vector for just a subset of the cpus means we can
support arbitrarily large systems.  Making the size of the pool a single
cpu was the simplest implementation of that idea.
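
To illustrate what I mean by a vector pool smaller than the whole machine,
here is a toy sketch (the names are invented for illustration; this is not
the code from the patch): give each cpu its own vector-to-irq table instead
of one global table, so the same vector number can be handed out again on a
different cpu.

/*
 * Illustration only: a toy model of per-cpu vector allocation.
 * The names (SKETCH_NR_CPUS, assign_irq_vector_on, ...) are made
 * up and are not taken from the actual 2.6.19 x86_64 code.
 */
#include <stdio.h>

#define SKETCH_NR_CPUS       4
#define FIRST_DEVICE_VECTOR  0x31
#define NR_VECTORS           256

/* One vector-to-irq table per cpu instead of a single global one;
 * -1 marks a free slot. */
static int vector_irq[SKETCH_NR_CPUS][NR_VECTORS];

static void init_vectors(void)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		for (int v = 0; v < NR_VECTORS; v++)
			vector_irq[cpu][v] = -1;
}

/* Reserve a free vector for an irq on one particular cpu.  The same
 * vector number may simultaneously carry a different irq on another
 * cpu, which is what keeps the allocation from being global. */
static int assign_irq_vector_on(int cpu, int irq)
{
	for (int v = FIRST_DEVICE_VECTOR; v < NR_VECTORS; v++) {
		if (vector_irq[cpu][v] == -1) {
			vector_irq[cpu][v] = irq;
			return v;
		}
	}
	return -1;	/* this cpu's vector space is exhausted */
}

int main(void)
{
	init_vectors();
	/* Both calls hand out vector 0x31: reuse across cpus is the point. */
	printf("irq 16 on cpu 0 -> vector 0x%x\n", assign_irq_vector_on(0, 16));
	printf("irq 17 on cpu 1 -> vector 0x%x\n", assign_irq_vector_on(1, 17));
	return 0;
}

With one global table the roughly 200 usable vectors bound the entire
machine; with a table per cpu (or per subset of cpus) that bound applies
per cpu, which is what lets the allocation scale.
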
> We used to do lowest-priority stuff in hw, and then Intel broke it, but I
> always told them that they were _stupid_ to break it. The fact is,
> especially with multi-core, it actually makes a lot of sense to have
> hardware decide which core to interrupt, because hardware simply
> potentially knows better.
>
> This is one of those age-old questions: in _theory_ you can do a better
> job in software, but in _practice_ it's just too damn expensive and
> complicated to do a perfect job especially with dynamic decisions, so in
> _practice_ it tends to be better to let hardware make some of the
> decisions.
>
> We can see the same thing in instruction scheduling: in _theory_ a
> compiler can do a better job of scheduling, since it can spend inordinate
> amounts of resources on doing things once, and then the hardware can be
> simpler and faster and never worry about it. In _practice_, however, the
> biggest scheduling decisions are all dynamic at run-time, and depend on
> things like cache misses etc, and only total idiots (or embedded people)
> will do static scheduling these days.
>
> I think it's a huge mistake to do static interrupt routing for the same
> reason.

I have no problem with that.  The only place where I caused a behavior
change on x86_64 is genapic_flat, which does this, and I figured it was
not a big deal simply because CONFIG_HOTPLUG_CPU is the default, so it
is rarely used.  I figured if my implementation was too simple someone
would scream and I could add the complexity to the vector allocator to
enable lowest priority interrupt delivery; a rough sketch of what that
could look like follows below.

Well, someone has screamed :)
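
To make "add the complexity to the vector allocator" concrete, here is
roughly the shape it could take, building on the toy table sketched above
(again, an illustration with invented names, not the patch itself): instead
of reserving a vector on a single cpu, reserve the same vector number on
every cpu in the irq's allowed mask, and then the ioapic entry can be
programmed for lowest priority delivery so the hardware picks whichever of
those cpus is awake.

/*
 * Illustration only, reusing the toy vector_irq[][] table from the
 * earlier sketch: find one vector number that is free on every cpu
 * in the mask and reserve it on all of them.  With that in place
 * the ioapic entry for the irq can use lowest priority delivery
 * and the hardware chooses the target cpu per interrupt.
 */
static int assign_irq_vector_on_mask(const int *cpus, int ncpus, int irq)
{
	for (int v = FIRST_DEVICE_VECTOR; v < NR_VECTORS; v++) {
		int free_on_all = 1;

		for (int i = 0; i < ncpus; i++) {
			if (vector_irq[cpus[i]][v] != -1) {
				free_on_all = 0;
				break;
			}
		}
		if (!free_on_all)
			continue;

		for (int i = 0; i < ncpus; i++)
			vector_irq[cpus[i]][v] = irq;
		return v;
	}
	return -1;	/* no vector is free on every cpu in the mask */
}

The tradeoff is that a vector claimed for the mask is consumed on every cpu
in it, so the pool scales with the size of the mask rather than with the
whole machine.
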
Eric