Message-ID: <m1irckmytv.fsf@ebiederm.dsl.xmission.com>
Date: Wed, 28 Mar 2007 22:57:16 -0600
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Len Brown <lenb@...nel.org>
Cc: Ingo Molnar <mingo@...e.hu>, luming.yu@...el.com,
Adrian Bunk <bunk@...sta.de>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, Thomas Meyer <thomas@...3r.de>,
Frederic Riss <frederic.riss@...il.com>,
Marcus Better <marcus@...ter.se>
Subject: Re: [patch] MSI-X: fix resume crash
Len Brown <lenb@...nel.org> writes:
>> Tony, Len the way pci_disable_device is being used in a suspend/resume
>> path by a few drivers is completely incompatible with the way irqs are
>> allocated on ia64. In particular, the following sequence occurs
>> in several drivers.
>>
>> probe:
>>     pci_enable_device(pdev);
>>     request_irq(pdev->irq);
>> suspend:
>>     pci_disable_device(pdev);
>> resume:
>>     pci_enable_device(pdev);
>> remove:
>>     free_irq(pdev->irq);
>>     pci_disable_device(pdev);
>
> There are no IA64 machines that support system suspend/resume today --
> so you have 0 chance of breaking the IA64 suspend/resume installed base.
Ok. So that is why the inconsistency persists...
> My understanding is that Luming Yu has cobbled IA64 S4 support
> together for a future release though.
>
>> What I'm proposing we do is move the irq allocation code out of
>> pci_enable_device and the irq freeing code out of pci_disable_device in
>> the future. If we move ia64 to a model where the irq number equals the
>> gsi like we have for x86_64 and are in the middle of for i386 that
>> should be pretty straightforward. It would even be relatively simple
>> to delay vector allocation in that context until request_irq, if we
>> needed the delayed allocation benefit. Do you two have any problems
>> with moving in that direction?
>
> I think consistency here would be _wonderful_.
> Of course the beauty of having identity GSI=IRQ and a /proc/interrupts
> that tells you what IOAPIC pin you are using becomes moot with MSI --
> but hey, showing the IRQ number rather than the vector number
> is consistent and makes sense.
Yes. It also allows for bigger machines. And I can get a consistent
number out of MSI if we allocate irq numbers in a sufficiently non-sparse
way. Something like bus|device|func|irq which is 8+5+3+12 or 28 bits...
I'll never get there, though, if I keep unearthing these long-standing bugs.
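To make the arithmetic concrete, here is a minimal sketch of such a packing. Only the field order (bus|device|func|irq) and the widths (8+5+3+12) come from the text above; the helper name pack_msi_irq and the macro names are hypothetical and do not correspond to any existing kernel interface.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths from the message: 8 + 5 + 3 + 12 = 28 bits. */
#define BUS_BITS   8
#define DEV_BITS   5
#define FUNC_BITS  3
#define IRQ_BITS  12

/*
 * Hypothetical helper (not a kernel API): pack bus|device|func|irq
 * into a single number, with bus in the most significant field.
 */
static uint32_t pack_msi_irq(uint32_t bus, uint32_t dev,
                             uint32_t func, uint32_t irq)
{
    /* Each field must fit in its width, or fields would overlap. */
    assert(bus  < (1u << BUS_BITS));
    assert(dev  < (1u << DEV_BITS));
    assert(func < (1u << FUNC_BITS));
    assert(irq  < (1u << IRQ_BITS));

    return (bus  << (DEV_BITS + FUNC_BITS + IRQ_BITS)) |
           (dev  << (FUNC_BITS + IRQ_BITS)) |
           (func << IRQ_BITS) |
           irq;
}
```

With every field at its maximum the result is 0x0fffffff, so the whole range fits in 28 bits of a 32-bit irq number.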
>> If fixing the arch code is unacceptable for some reason I'm not aware of
>> we need to audit the 10-20 drivers that call pci_disable_device in their
>> suspend/resume processing and ensure that they have freed all of the
>> irqs before that point. Given that I have bug reports on the msi path I
>> know that isn't true.
>
> I think the suspend/resume interrupt logic needs some serious attention.
> We've had several schemes for suspend/resume of interrupts, several
> changes in strategy, and right now I think we are inconsistent,
> and frankly, I'm amazed it works at all.
What I have been doing lately is to aim at consistency in how a function
is called (and thus how it is expected to be used) and how it is actually
implemented. When I have a choice I try to pick a forgiving implementation
so that driver writers don't have to follow a magic correct path for
things to work correctly.
Removing the irq assignment from pci_enable_device is something that
matches implementation with use.
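A toy model of the mismatch, as a sketch only: it assumes an arch where enabling the device assigns an irq and disabling it frees that irq, which is the ia64 behaviour described in the quoted text. Everything here (struct toy_dev, the toy_* functions, the counter that hands out a fresh number on every enable) is hypothetical and only loosely mirrors pci_enable_device, pci_disable_device and request_irq.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a PCI device; not the kernel's struct pci_dev. */
struct toy_dev {
    int irq;           /* irq currently assigned to the device, -1 if none */
    int requested_irq; /* irq the driver's handler is attached to, -1 if none */
};

/* Assumption: the toy allocator hands out a fresh number on every enable. */
static int toy_next_irq = 16;

/* Models an enable path that also allocates the irq. */
static void toy_enable(struct toy_dev *d)  { d->irq = toy_next_irq++; }

/* Models a disable path that also frees the irq. */
static void toy_disable(struct toy_dev *d) { d->irq = -1; }

/* Models request_irq(pdev->irq): the handler binds to the current irq. */
static void toy_request_irq(struct toy_dev *d)
{
    assert(d->irq != -1);
    d->requested_irq = d->irq;
}

/* True if the handler is attached to the irq the device now has. */
static bool toy_handler_matches(const struct toy_dev *d)
{
    return d->irq != -1 && d->requested_irq == d->irq;
}

/* Runs probe, suspend, resume as in the quoted driver sequence. */
static bool toy_suspend_resume_cycle(void)
{
    struct toy_dev d = { -1, -1 };

    toy_enable(&d);      /* probe:   enable ...           */
    toy_request_irq(&d); /*          ... then request irq */
    toy_disable(&d);     /* suspend: irq freed, handler still attached */
    toy_enable(&d);      /* resume:  device gets a new irq */

    return toy_handler_matches(&d);
}
```

In this model toy_suspend_resume_cycle() returns false: the handler stays bound to the irq from probe while the device comes back with a different one. Moving the assignment out of the enable/disable pair removes that window.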
As for the rest it seems reasonable to me to allow an irq to be held
requested over suspend/resume and to save and restore apic and msi
capability state. Especially since irq numbers are a kernel
abstraction we should be able to do with them what we need to.
Honestly, the whole suspend/resume thing is beyond me at this point, as I'm
laptop free... But I do know how to make code consistent with itself.
Eric