[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20061116042432.26300.qmail@science.horizon.com>
Date: 15 Nov 2006 23:24:32 -0500
From: linux@...izon.com
To: linux-kernel@...r.kernel.org, torvalds@...l.org
Cc: linux@...izon.com
Subject: Re: [PATCH] ALSA: hda-intel - Disable MSI support by default
> It all boils down to the same thing: either we have to know that MSI works
> (where "know" is obviously relative - it's not like you can avoid _all_
> bugs, but dammit, even a single report of "not working" means that there
> are probably a ton of machines like that, and we did something wrong), or
> we shouldn't use it. There is no middle ground. You can't really safely
> "test" for it, and while you _can_ say "just do both", it doesn't really
> help anything (and potentially exposes you to just more bugs: if enablign
> MSI actually _does_ disable INTx, but then doesn't work, at a minimum you
> end up with a device that doesn't work, even if the rest of the kernel
> might be ok).
Er... why can't you test it?
The fundamental problem in IRQ routing is that if you have it wrong,
you have a screaming interrupt that you can't shut up.
Well, before giving up entirely, assume that *some* device owns that
interrupt, it's just mis-routed.
So start calling the IRQ handlers for *every* PCI device until the
damn interrupt goes away, or you've really proved that it can't
be shut up.
If you have interrupts coming in fast, you might have to retry a few
times to be sure there's nothing to be done, but that's nothing new.
Now, if you get really nasty with the locking, you can disable all
other interrupt handlers on all processors until you've dealt with
the screaming interrupt, and when it goes away, you can point the
finger conclusively at the device which has the misrouted interrupt.
And you've also found where its interrupt *is* routed.
If you don't have such nasty locking, then you only have a strong
suspiscion about what did it, but a few repetitions can firm that up.
Any time a device handles an interrupt that arrived from the expected
place, its index of suspiscion goes down. You can even use this
to select an order to call the various PCI drivers.
All of this can be rather time-consuming and mess up real-time response,
but it's only once per boot, and being able to point the finger accurately
at buggy hardware rather than not working at all for mysterious reasons is
quite nice. And an increasing number of BIOS vendors and OEMs are testing
with Linux, even if only cursorily, so there is some (slow) feedback.
Whether you actually learn to cope with misrouted interrupts is
a separate issue that I won't raise.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists