lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20120511185337.E31BC3E0791@localhost>
Date:	Fri, 11 May 2012 12:53:37 -0600
From:	Grant Likely <grant.likely@...retlab.ca>
To:	Jean-Francois Dagenais <jeff.dagenais@...il.com>,
	linus.walleij@...ricsson.com
Cc:	torvalds@...ux-foundation.org,
	open list <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>, eric.miao@...vell.com,
	maz@...terjones.org
Subject: Re: gpio: pca953x: interrupt feature unreliable (re-re-send, Grant please acknowledge!!)

On Thu, 10 May 2012 14:19:17 -0400, Jean-Francois Dagenais <jeff.dagenais@...il.com> wrote:
> Hi all,
> 
> Here's a fun, yet potentially dangerous problem.

Hi Jean-Francois,

It sounds like your analysis is accurate and your conclusions look
correct.  There really isn't much that I can do for you on this though
as I don't have the hardware.  As Linus mentioned, David Jander has
worked on the driver in the past, and git log shows some other folks.
I would ask them.

g.

> 
> I've sent this twice already, without any feedback. I'm just trying to get more 
> visibility on this because I believe it to be a serious issue depending where 
> this chip may be used. Interrupts could be missed in what could be very tough 
> conditions to replicate. In my setup I could reproduce the problem easily and 
> took great care in analysing and documenting it.
> ===============Re-Re-Send (with edits)==================
> Hi all,
> (Using 3.1.0-rc4, didn't change since)
> 
> I have detected a flaw in the interrupt support of the gpio-pca953x driver. 
> This is how I discovered the issue:
> 
> I have a interrupt client device which has it's interrupt signal connected to 
> one of the GPIOs of the pca9555. The client is an ad714x device. It's interrupt 
> routine is falling edge triggered (so says the driver, but according to the 
> AD714x specs, it should be level,low, however this is another topic). The 
> pca9555 is interrupting the CPU through INTA of an ethernet PCI card for which 
> I have blacklisted the driver module and only done "echo 1 > 
> /bus/pci/devices/00xyz/enable". It is "(level, low)" which matches the pca953x 
> request_threaded_isr call (IRQF_TRIGGER_LOW | IRQF_ONESHOT). I used two GPOs on 
> some other device to get the !ISR signals shown below. These show the nested 
> nature of the cli ISR called from the pca953x's ISR.
> ASCII art of my probing (use a monospace font, "!ISR"==low, mean ISR running)
> 
> Events(lined-up):1  2 3 4 5  6    7       8    9
>              ____          _______
> client !INT      |________|       |________(lost)___
>              ___________                   _________
> cli !ISR                |____|____________|
>              ____      ___         _________________
> pca953x !INT     |____|   |_______|
>              ______                             ____
> pca953x !ISR       |___________________________|
> (note: the little spike in the cli !ISR marks the end of reading the client's 
> status registers, the rest of the ISR is updating driver state machines, etc.)
> 
> Summary: The client enters interrupt(1), which immediately puts the pca953x in 
> interrupt as well(1). When the pca953x enters it's isr thread(2), it reads the 
> input status register, this immediately clears the pca953x interrupt(3). The 
> pca953x ISR then figure's out it should invoke the nested ISR for the client. 
> The client ISR starts (4) by reading the client chip's status register, which 
> clears the client interrupt(5). This immediately puts the pca953x back into 
> interrupt (5) since it's a different state than what was last read in the 
> pca953x ISR(3). Since pca953x registered it's interrupt with IRQF_ONESHOT, this 
> new IRQ state is masked and goes un-noticed. The pca953x chips' FLAW IS THEN 
> REVEILED if/when the client chip goes back in interrupt before (7) the client 
> ISR (and the pca053x's ISR with the current code) even has time to finish(8 and 
> 9). At that moment, the pca953x's input are back to the way they where when the 
> pca953x ISR last read the input status register. So when all of the ISR 
> routines are done (9...), the client has it's INT line asserted and no more 
> interrupt will come from the pca953x because of this.
> 
> pca953x specs quote: "Resetting the interrupt circuit is achieved when data on 
> the port is CHANGED TO THE ORIGINAL SETTING, data is read from the port that 
> generated the interrupt or in a Stop event. [...] Interrupts that occur during 
> the ACK or NACK clock pulse CAN BE LOST (or be very short) due to the resetting 
> of the interrupt during this pulse."
> 
> So in essence, although the pca953x.c's design and doc says only edges are 
> detected and supported, the chip design makes it IMPOSSIBLE to guarantee that 
> all edges will be caught.
> 
> The IRQ function SHOULD BE REMOVED from the driver since the chip cannot honor 
> the driver's promise. At the very least, for system integrators that want to 
> use level base interrupts implemented by the pca953x, changes to the driver 
> should be made to correctly handle the lack of LATCHING INTERRUPT STATUS 
> register...
> 
> I would propose a patch, but I am missing knowledge on the internals of the 
> kernel's irq code, so here's a simple suggestion description, hoping to get 
> some feedback.
> 
> First of all, I propose pca953x_irq_set_type should only support 
> IRQ_TYPE_LEVEL_HIGH and IRQ_TYPE_LEVEL_LOW. pca953x_irq_handler should be 
> adjusted accordingly.
> 
> To make sure we catch as many transitions of pca953x chips INT as possible 
> (yikes), we make pca953x_irq_setup() request it's own interrupt as both rising 
> and falling (IRQF_TRIGGER_RISING|IRQF_TRIGGER_FALLING, is that even possible?) 
> without IRQF_ONESHOT, so that pca953x_irq_handler is rescheduled for running 
> even when it's already running.
> 
> Any thoughts? Maybe there are more/less fancy, or better yet, "correct" ways of 
> doing this? Will this even work?
> 
> /jfd
> 

-- 
Grant Likely, B.Sc, P.Eng.
Secret Lab Technologies, Ltd.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ