lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 10 May 2012 14:19:17 -0400
From:	Jean-Francois Dagenais <jeff.dagenais@...il.com>
To:	Grant Likely <grant.likely@...retlab.ca>,
	linus.walleij@...ricsson.com
Cc:	torvalds@...ux-foundation.org,
	open list <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>, eric.miao@...vell.com,
	maz@...terjones.org
Subject: gpio: pca953x: interrupt feature unreliable (re-re-send, Grant please acknowledge!!)

Hi all,

Here's a fun, yet potentially dangerous problem.

I've sent this twice already, without any feedback. I'm just trying to get more 
visibility on this because I believe it to be a serious issue depending where 
this chip may be used. Interrupts could be missed in what could be very tough 
conditions to replicate. In my setup I could reproduce the problem easily and 
took great care in analysing and documenting it.
===============Re-Re-Send (with edits)==================
Hi all,
(Using 3.1.0-rc4, didn't change since)

I have detected a flaw in the interrupt support of the gpio-pca953x driver. 
This is how I discovered the issue:

I have a interrupt client device which has it's interrupt signal connected to 
one of the GPIOs of the pca9555. The client is an ad714x device. It's interrupt 
routine is falling edge triggered (so says the driver, but according to the 
AD714x specs, it should be level,low, however this is another topic). The 
pca9555 is interrupting the CPU through INTA of an ethernet PCI card for which 
I have blacklisted the driver module and only done "echo 1 > 
/bus/pci/devices/00xyz/enable". It is "(level, low)" which matches the pca953x 
request_threaded_isr call (IRQF_TRIGGER_LOW | IRQF_ONESHOT). I used two GPOs on 
some other device to get the !ISR signals shown below. These show the nested 
nature of the cli ISR called from the pca953x's ISR.
ASCII art of my probing (use a monospace font, "!ISR"==low, mean ISR running)

Events(lined-up):1  2 3 4 5  6    7       8    9
             ____          _______
client !INT      |________|       |________(lost)___
             ___________                   _________
cli !ISR                |____|____________|
             ____      ___         _________________
pca953x !INT     |____|   |_______|
             ______                             ____
pca953x !ISR       |___________________________|
(note: the little spike in the cli !ISR marks the end of reading the client's 
status registers, the rest of the ISR is updating driver state machines, etc.)

Summary: The client enters interrupt(1), which immediately puts the pca953x in 
interrupt as well(1). When the pca953x enters it's isr thread(2), it reads the 
input status register, this immediately clears the pca953x interrupt(3). The 
pca953x ISR then figure's out it should invoke the nested ISR for the client. 
The client ISR starts (4) by reading the client chip's status register, which 
clears the client interrupt(5). This immediately puts the pca953x back into 
interrupt (5) since it's a different state than what was last read in the 
pca953x ISR(3). Since pca953x registered it's interrupt with IRQF_ONESHOT, this 
new IRQ state is masked and goes un-noticed. The pca953x chips' FLAW IS THEN 
REVEILED if/when the client chip goes back in interrupt before (7) the client 
ISR (and the pca053x's ISR with the current code) even has time to finish(8 and 
9). At that moment, the pca953x's input are back to the way they where when the 
pca953x ISR last read the input status register. So when all of the ISR 
routines are done (9...), the client has it's INT line asserted and no more 
interrupt will come from the pca953x because of this.

pca953x specs quote: "Resetting the interrupt circuit is achieved when data on 
the port is CHANGED TO THE ORIGINAL SETTING, data is read from the port that 
generated the interrupt or in a Stop event. [...] Interrupts that occur during 
the ACK or NACK clock pulse CAN BE LOST (or be very short) due to the resetting 
of the interrupt during this pulse."

So in essence, although the pca953x.c's design and doc says only edges are 
detected and supported, the chip design makes it IMPOSSIBLE to guarantee that 
all edges will be caught.

The IRQ function SHOULD BE REMOVED from the driver since the chip cannot honor 
the driver's promise. At the very least, for system integrators that want to 
use level base interrupts implemented by the pca953x, changes to the driver 
should be made to correctly handle the lack of LATCHING INTERRUPT STATUS 
register...

I would propose a patch, but I am missing knowledge on the internals of the 
kernel's irq code, so here's a simple suggestion description, hoping to get 
some feedback.

First of all, I propose pca953x_irq_set_type should only support 
IRQ_TYPE_LEVEL_HIGH and IRQ_TYPE_LEVEL_LOW. pca953x_irq_handler should be 
adjusted accordingly.

To make sure we catch as many transitions of pca953x chips INT as possible 
(yikes), we make pca953x_irq_setup() request it's own interrupt as both rising 
and falling (IRQF_TRIGGER_RISING|IRQF_TRIGGER_FALLING, is that even possible?) 
without IRQF_ONESHOT, so that pca953x_irq_handler is rescheduled for running 
even when it's already running.

Any thoughts? Maybe there are more/less fancy, or better yet, "correct" ways of 
doing this? Will this even work?

/jfd

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ