linux-kernel - Re: Serial related oops

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-id: <45DA84A9.5010908@shaw.ca>
Date:	Mon, 19 Feb 2007 23:18:33 -0600
From:	Robert Hancock <hancockr@...w.ca>
To:	"Michael K. Edwards" <medwards.linux@...il.com>
Cc:	Jose Goncalves <jose.goncalves@...v.pt>,
	Frederik Deweerdt <deweerdt@...e.fr>,
	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: Serial related oops

Michael K. Edwards wrote:
> Of course not.  But dealing with a stuck IRQ line by locking up isn't
> very practical either.  IRQ sharing is stupid yet universal, and it

And we don't, that's why we have that "nobody cared" logic that disables 
the interrupt line if no driver services the interrupt. That doesn't 
provide a clean recovery, of course, it's meant to notify the user of 
what happened so that the problem can be fixed.

> happens all the time that a device that has been sitting there minding
> its own business since power-up, with no driver to drive it, decides
> to assert its IRQ.  Maybe it just got hot-plugged, maybe it just got
> its first dribble of input, whatever.  Other devices on the shared IRQ
> are screwed (or at least semi-screwed; you could periodically
> re-enable the IRQ long enough to make a run through the ISR chain
> servicing the other devices).  But if you run "lspci" (or whatever)
> and load a driver for the newly awake device, everything goes back to
> normal.
> 
> For devices compiled into the kernel, you shouldn't have to play these
> games.  If, that is, there were three stages of driver initialization,
> called in successive passes:

Exactly, for devices compiled into the kernel. In most setups this is 
only a fraction of all devices, so solving this problem only for drivers 
built into the kernel is no solution.

> 
> 1) installing an ISR with a fallback STFU path (device-specific but
> not dependent on any particular pre-existing chip state), quiescing it
> if you know how and registering for the IRQ if you know which it is;
> 
> 2) going through the chip's soft-reset-wake-up-shut-up cycle and
> populating driver data structures, possibly correcting the IRQ
> registration along the way;
> 
> 3) ready-as-we'll-ever-be, bring on the interrupts.
> 
> You probably can't help enabling the IRQ briefly during 2) so that you
> can do tests like Russell's loopback.  But it's a needless gamble to
> do that without doing 1) for all compiled-in drivers and platform
> devices first, in a previous discovery pass.  And it's stupid to do 3)
> in the same pass as 2), because you'll just open race condition
> windows that will only bite when an all-the-way-live device raises its
> IRQ at a moment when the writer of the wake-up-shut-up code wasn't
> expecting it.  All code has bugs and they're only a problem when they
> bite in the field.
> 
>> If a system has a device that generates interrupts before they're 
>> enabled,
>> and the firmware doesn't fix it, then some platform-specific quirk has to
>> handle it and shut off the interrupt before it allows any interrupts
>> to be enabled. (We have such a quirk for certain network controllers 
>> where
>> the boot ROM can leave the chip generating interrupts on bootup.)
> 
> You don't need quirks if your driver initialization is bomb-proof to
> begin with.  Devices that are quiet on power-up are purely
> coincidental and should not be construed.

It's not coincidental, it is the only sane way to design hardware. You 
just can't go firing off interrupts without a driver having 
intentionally enabled them. There are a few devices that have had such 
issues, but they have been few and far between, certainly not enough to 
warrant the complexity of the scheme you propose.

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@...pamshaw.ca
Home Page: http://www.roberthancock.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/