Message-ID: <513DCB65.3010301@redhat.com>
Date: Mon, 11 Mar 2013 08:17:41 -0400
From: Prarit Bhargava <prarit@...hat.com>
To: Neil Horman <nhorman@...driver.com>
CC: Myron Stowe <myron.stowe@...il.com>, linux-kernel@...r.kernel.org,
Don Zickus <dzickus@...hat.com>,
Don Dutile <ddutile@...hat.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Asit Mallick <asit.k.mallick@...el.com>,
linux-pci@...r.kernel.org
Subject: Re: [PATCH v2] irq: add quirk for broken interrupt remapping on 55XX chipsets

On 03/11/2013 07:25 AM, Neil Horman wrote:
> On Sat, Mar 09, 2013 at 03:20:57PM -0700, Myron Stowe wrote:
>> On Sat, Mar 9, 2013 at 1:49 PM, Neil Horman <nhorman@...driver.com> wrote:
>>> On Mon, Mar 04, 2013 at 02:04:19PM -0500, Neil Horman wrote:
>>>> A few years back Intel published a spec update:
>>>> http://www.intel.com/content/dam/doc/specification-update/5520-and-5500-chipset-ioh-specification-update.pdf
>>>>
>>>> For the 5520 and 5500 chipsets which contained an erratum (specifically erratum
>>>> 53), which noted that these chipsets can't properly do interrupt remapping, and
>>>> as a result they recommend that interrupt remapping be disabled in BIOS. While
>>>> many vendors have a BIOS update to do exactly that, not all do, and of course
>>>> not all users update their BIOS to a level that corrects the problem. As a
>>>> result, occasionally interrupts can arrive at a CPU even after affinity for that
>>>> interrupt has been moved, leading to lost or spurious interrupts, usually
>>>> characterized by the message:
>>>> kernel: do_IRQ: 7.71 No irq handler for vector (irq -1)
>>>>
>>>> There have been several incidents recently of people seeing this error, and
>>>> investigation has shown that they have systems whose BIOS level is such
>>>> that this feature was not properly turned off. As such, it would be good to
>>>> give them a reminder that their systems are vulnerable to this problem.
>>>>
>>>> Signed-off-by: Neil Horman <nhorman@...driver.com>
>>>> CC: Prarit Bhargava <prarit@...hat.com>
>>>> CC: Don Zickus <dzickus@...hat.com>
>>>> CC: Don Dutile <ddutile@...hat.com>
>>>> CC: Bjorn Helgaas <bhelgaas@...gle.com>
>>>> CC: Asit Mallick <asit.k.mallick@...el.com>
>>>> CC: linux-pci@...r.kernel.org
>>>>
>>> Ping, anyone want to Ack/Nack this?
>>
>> Don's comment earlier seems to imply that this is a short term fix and
>> that a more long term fix may be coming soon. If that is the case
>> wouldn't we want to wait for the long term fix and just pull that in?
>>
>> Myron
>>
> As Don and Prarit have mentioned, an alternate change is being worked on and
> tested that may work around this issue, but we're not yet sure that it will, and
> we're not sure of the time frame for this fix. Normally I would agree that it
> would be easier just to wait for the long term fix, but as Prarit noted, since
> this hardware is in fact broken, I would rather take both approaches. It's fine if
> this gets reverted tomorrow with a longer term fix as far as I'm concerned, it's
> just caused enough problems already that I'd like to see it in place until the
> better solution arrives.
I agree with Neil on this. While vendors are supposed to fix their BIOSes,
experience has shown that not all vendors will fix their BIOSes for a problem
like this.

Ack this quirk.

P.
> Neil
>
>