Message-ID: <m1my8mukyw.fsf@fess.ebiederm.org>
Date: Fri, 05 Jun 2009 19:58:31 -0700
From: ebiederm@...ssion.com (Eric W. Biederman)
To: suresh.b.siddha@...el.com
Cc: "mingo\@elte.hu" <mingo@...e.hu>, "hpa\@zytor.com" <hpa@...or.com>,
"tglx\@linutronix.de" <tglx@...utronix.de>,
"linux-kernel\@vger.kernel.org" <linux-kernel@...r.kernel.org>,
"ak\@linux.intel.com" <ak@...ux.intel.com>,
"travis\@sgi.com" <travis@....com>,
"steiner\@sgi.com" <steiner@....com>,
Gary Hade <garyhade@...ibm.com>
Subject: Re: [patch] x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping
Suresh Siddha <suresh.b.siddha@...el.com> writes:
> On Thu, 2009-06-04 at 18:47 -0700, Eric W. Biederman wrote:
>> As far as this patch goes it looks like an improvement.
>>
>> Acked-by: "Eric W. Biederman" <ebiederm@...ssion.com>
>
> Thanks. Ingo, Peter: Can you please queue this patch? Eric, more
> comments below.
>
>> However after looking at Gary's issues I see some things that are still wrong
>> on this path.
>>
>> 1) We don't do the part of irq migration that moves irq threads.
>> We aren't using irq threads yet but still
>>
>
> Ok. Will look at the irq threads code (-rt tree?).
In 2.6.30 no one is using them yet, but the code is merged.
>> If we could figure out how to call irq_set_affinity for the IRQ_MOVE_PCNTXT
>> code path that would make the maintenance a lot simpler.
>
> Ok. Will post this cleanup sometime.
Thanks.
>> 2) We still diverge on 32bit vs 64bit for no reason.
>> I expect the fixed 64bit version should be moved into apic/io_apic.c
>
> Agree.
>
>>
>> 3) We still enable irqs for a short while after this to let things drain.
>> I am wondering if that is really necessary. It does very simply
>> allow the irq cleanup ipi to happen, and it unjams any irqs that happened
>> before we migrated them.
>
> Yes.
>
>> If we wanted to very strictly follow the rules I guess we could do something
>> like the cleanup_ipi by hand on the cpu that is going down and rebroadcast
>> all of the pending irqs to another cpu to process.
>>
>
> This will be ok for all the cases except the most difficult case (non
> interrupt-remapping and IO-APIC level triggered). We should service the
> pending interrupt from the same cpu 'X' (that is going down) because of
> the vector information (that is cpu 'X' specific) in the IO-APIC RTE.
> Otherwise we have to do directed EOI (and I am not sure if all the
> IO-APICs support that) to the io-apic.
>
> Is there a problem with enabling irqs in the fixup_irqs()?
Strictly speaking it is against the rules. At least it was the last time
I audited the cpu hotunplug path. In practice it seems to work well.
> My old proposal (which will fix the stuck IRR issue seen by Gary) for
> fixing the level irq migration in fixup_irqs() is to do something like:
>
> Disable interrupts
> ..
> Mask io-apic RTE
> check if the IRR is set
>   if so,
>     unmask the io-apic RTE
>     enable interrupts
>     and go back to top
>   else
>     ok to migrate the io-apic rte.
>
> So if there is any other reason for keeping the interrupts disabled
> during fixup_irqs(), then we need to think of another strategy to
> address this.
As long as fixup_irqs is where we grow the nasty work-arounds and
we don't burden the other, saner paths, I am generally in favor of it.
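To make the mask / check IRR / retry loop quoted above concrete, here is a
rough user-space model of it. This is only a sketch, not kernel code:
struct mock_rte, mock_service_pending() and mock_migrate_level_irq() are
made-up names, the max_tries bound is my addition, and unmasking after the
move is an assumption the quoted outline does not spell out.

```c
#include <stdbool.h>

/* Hypothetical model of one IO-APIC redirection table entry (RTE). */
struct mock_rte {
    bool masked;       /* mask bit in the RTE */
    bool irr;          /* remote IRR: level irq delivered, EOI not yet seen */
    int  dest_cpu;     /* cpu the RTE currently targets */
    int  drain_calls;  /* how many times we opened an irq window */
};

/*
 * Stand-in for "enable interrupts": the pending level interrupt gets
 * serviced on the cpu that is going down, and its EOI clears the IRR.
 */
static void mock_service_pending(struct mock_rte *rte)
{
    rte->drain_calls++;
    rte->irr = false;
}

/*
 * Returns true once the RTE has been retargeted to new_cpu, false if the
 * IRR never cleared within max_tries passes (the bound is an assumption).
 */
static bool mock_migrate_level_irq(struct mock_rte *rte, int new_cpu,
                                   int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        /* interrupts are considered disabled at the top of each pass */
        rte->masked = true;                /* mask io-apic RTE */
        if (!rte->irr) {
            rte->dest_cpu = new_cpu;       /* ok to migrate the RTE */
            rte->masked = false;           /* assumed: unmask after the move */
            return true;
        }
        rte->masked = false;               /* unmask the io-apic RTE */
        mock_service_pending(rte);         /* enable interrupts, let it drain */
        /* and go back to top */
    }
    return false;
}
```

The point of the model is just that the RTE is only rewritten while it is
masked and the IRR is clear; everything else is scaffolding.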
> Otherwise, if everyone agrees with this direction then we can try to
> come up with a clean patch for this.
There is also something else we can do. We can register a cpu hotplug
notifier and tell hotunplug to fail if we are dealing with a case that
we cannot actually support cleanly.
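The decision such a notifier would make can be sketched in user space
roughly as follows. This is a model, not the kernel notifier API: enum
mock_verdict, struct mock_irq and mock_cpu_down_prepare() are hypothetical
names, and the per-irq process_ctx_safe flag is an assumption about how
the "can we migrate this cleanly" answer might be recorded.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical verdicts, loosely modelled on notifier-style OK/BAD results. */
enum mock_verdict { MOCK_OK, MOCK_BAD };

/* Hypothetical per-irq bookkeeping. */
struct mock_irq {
    int  target_cpu;        /* cpu this irq is currently delivered to */
    bool process_ctx_safe;  /* true if it can be migrated from process context */
};

/*
 * Veto the unplug of 'cpu' if any irq routed to it can only be migrated
 * safely from interrupt context; otherwise let the unplug proceed.
 */
static enum mock_verdict mock_cpu_down_prepare(const struct mock_irq *irqs,
                                               size_t n, int cpu)
{
    for (size_t i = 0; i < n; i++) {
        if (irqs[i].target_cpu == cpu && !irqs[i].process_ctx_safe)
            return MOCK_BAD;
    }
    return MOCK_OK;
}
```

With irqs = { {1, true}, {2, false} }, taking cpu 1 down would be allowed
and taking cpu 2 down would be refused.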
At the same time there are a couple more cases that we can move into
process context, MSI cpu irq migration being the primary one.
It needs really good testing, but I think all of lowest priority interrupt
delivery can be done safely from process context as well, since we only
need a single register write, and if we have properly shut down the cpu
the hardware just won't deliver irqs to it, automagically.
My pipe dream is that we can move just enough irq migration cases into
process context that the suspend to ram code will be happy. And
we can make cpu hotunplug fail if the irqs are only safe to migrate
in interrupt context.
Eric