[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <08FE5CC30C9A3F41BF819A502CF7BF6E0100376E@fmsmsx411.amr.corp.intel.com>
Date: Fri, 30 Mar 2007 13:49:20 -0700
From: "Williams, Mitch A" <mitch.a.williams@...el.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>,
"Greg KH" <gregkh@...e.de>
Cc: "Andrew Morton" <akpm@...ux-foundation.org>,
<linux-pci@...ey.karlin.mff.cuni.cz>,
<linux-kernel@...r.kernel.org>,
"Kok, Auke-jan H" <auke-jan.h.kok@...el.com>
Subject: RE: [PATCH 2.6.21-rc5] Flush MSI-X table writes (rev 3)
Eric W. Biederman wrote:
>The bug report would be phrased as someone seeing "No irq for vector"
>on x86_64. Unless they are a skilled developer they are unlikely to
>trace it down to not flushing posted writes to a MSI bar during irq
>migration. It part it is a subtle hardware/software race.
>
>I have seen some unresolved bug reports described this way on machines
>that have MSI-X capable hardware. The are against fedora core and I
>can't haven't been able to get the reporters to try 2.6.21-rcX and
>the other irq back ports don't work so well.
>
>I currently count 7 drivers in 2.6.21 that are susceptible to the race
>this bug closes.
>
>This version of the patch seems to meet all of the criteria for being
>obviously correct, and all it does is insert a stupid readl. Not
>very dangerous.
>
>This fix removes a race between hardware and software that you likely
>need to migrate MSI-X irqs a lot to trigger. Which can be the kind of
>thing that bytes you after you have a month of uptime. This is a very
>nasty condition to track down from bug reports. I'd rather not have a
>known culprit out there.
Agreed, this is a subtle bug, and was a real hairball to track down.
Even so, I'm surprised that nobody else has dug into this, since it
should affect anybody running MSI-X. I originally thought I was seeing
a hardware bug, which is why I dug more deeply into the issue.
If Eric is seeing bug reports related to "no vector for IRQ" in the
wild, then I have to change my stance and agree that this should be
pushed to -stable. Every one of those messages indicates that we
hit the race condition.
-Mitch
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists