Date:	Wed, 30 Jan 2013 13:08:50 +0200
From:	Pantelis Antoniou <panto@...oniou-consulting.com>
To:	Mugunthan V N <mugunthanvnm@...com>
Cc:	Richard Cochran <richardcochran@...il.com>,
	Matt Porter <mporter@...com>,
	Chase Maupin <chase.maupin@...com>, Jason Kridner <jdk@...com>,
	Tony Lindgren <tony@...mide.com>, <linux-omap@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] cpsw: Fix interrupt storm among other things

Hi Mugunthan,

On Jan 30, 2013, at 12:55 PM, Mugunthan V N wrote:

> On 1/30/2013 3:06 PM, Pantelis Antoniou wrote:
>> Hi,
>> 
>> On Jan 30, 2013, at 11:03 AM, Mugunthan V N wrote:
>> 
>>> On 1/30/2013 2:06 PM, Pantelis Antoniou wrote:
>>>> Hi Mugunthan,
>>>> 
>>>> On Jan 29, 2013, at 1:45 PM, Mugunthan V N wrote:
>>>> 
>>>>> On 1/28/2013 6:41 PM, Pantelis Antoniou wrote:
>>>>>> Fix interrupt storm on bone A4 caused by non-by-the-book interrupt handling.
>>>>>> While at it, add a non-NAPI mode (which is easier to debug), plus
>>>>>> some general fixes.
>>>>>> 
>>>>>> Signed-off-by: Pantelis Antoniou <panto@...oniou-consulting.com>
>>>>>> ---
>>>>>>  Documentation/devicetree/bindings/net/cpsw.txt |   1 +
>>>>>>  drivers/net/ethernet/ti/cpsw.c                 | 222 +++++++++++++++++++++----
>>>>>>  drivers/net/ethernet/ti/davinci_cpdma.c        |   4 +-
>>>>>>  drivers/net/ethernet/ti/davinci_cpdma.h        |   2 +-
>>>>>>  include/linux/platform_data/cpsw.h             |   1 +
>>>>>>  5 files changed, 194 insertions(+), 36 deletions(-)
>>>>> I have tested CPSW on AM335x EVM 1.5A with flood ping and I am not
>>>>> seeing any interrupt storm.
>>>>> Can you provide more details on how to reproduce the issue?
>>>>> 
>>>> A BeagleBone prototype with the new silicon version, which has the Ethernet
>>>> erratum fixed, displays this. You can't trigger it on old silicon.
>>>> 
>>>> The TI people on the CC list can confirm.
>>> But I have the same silicon revision (PG2.0) in my EVM and I am not seeing any issues. Can you
>>> point me to the Ethernet erratum you are referring to?
>>> 
>>> Regards
>>> Mugunthan V N
>> What kernel version are you using? This is only triggered on the mainline driver.
>> 
>> The advisory in question: From http://www.ti.com/lit/er/sprz360c/sprz360c.pdf
>> 
>> Advisory 1.0.9: "Ethernet Media Access Controller and Switch Subsystem: C0_TX_PEND
>> and C0_RX_PEND Interrupts Not Connected to ARM Cortex-A8"
>> 
>> I bet you're using an old kernel driver with the workarounds with the timers.
>> 
>> If I had to guess (although I didn't use a probe or anything), I'd say the
>> interrupts are now proper level interrupts, instead of working in
>> edge-triggered mode due to the workaround.
>> 
>> Apparently the interrupt was never acked properly in the original driver
>> (the sequence described in the TRM is not followed).
>> 
>> Looking at the TRM (spruh73g.pdf), 14.3.1.3 Interrupts in particular, the
>> status registers are not read and, more damning, the proper values are not
>> written to the CPDMA_EOI_VECTOR register.
>> 
>> The original driver blindly wrote zero (cpdma_ctlr_eoi), while you have to
>> write different values according to the interrupt you ack.
>> 
>> What happened was that on the first interrupt, the interrupt was never acked,
>> and we had an irq storm...
>> 
>> Regards
>> 
>> -- Pantelis
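
To make the EOI point quoted above concrete, here is a minimal sketch of
acking with a per-interrupt value instead of always writing zero. The
numeric values (0 = RX threshold, 1 = RX, 2 = TX, 3 = misc) are an
assumption based on my reading of the TRM's CPDMA_EOI_VECTOR description,
so check them against your TRM revision. The register is replaced by a
plain variable, and every name below (cpsw_eoi, cpsw_ack_irq,
fake_eoi_vector_reg, the *_sketch handlers) is hypothetical and not taken
from the patch.

```c
#include <stdint.h>

/* EOI values as assumed from the TRM's CPDMA_EOI_VECTOR description;
 * verify against your TRM revision before relying on them. */
enum cpsw_eoi {
	CPSW_EOI_RX_THRESH = 0,
	CPSW_EOI_RX        = 1,
	CPSW_EOI_TX        = 2,
	CPSW_EOI_MISC      = 3,
};

/* Stand-in for the memory-mapped CPDMA_EOI_VECTOR register (hypothetical).
 * A real driver would use an ioremap'ed address and writel() instead. */
static uint32_t fake_eoi_vector_reg = 0xffffffffu;

/* Hypothetical helper: signal end-of-interrupt for one specific source. */
static void cpsw_ack_irq(enum cpsw_eoi which)
{
	fake_eoi_vector_reg = (uint32_t)which;
}

/* Hypothetical RX handler: service the interrupt, then ack it as RX. */
static int cpsw_rx_irq_sketch(void)
{
	/* ... read the RX status register and process packets here ... */
	cpsw_ack_irq(CPSW_EOI_RX);
	return 1; /* handled */
}

/* Hypothetical TX handler: service the interrupt, then ack it as TX. */
static int cpsw_tx_irq_sketch(void)
{
	/* ... read the TX status register and reclaim descriptors here ... */
	cpsw_ack_irq(CPSW_EOI_TX);
	return 1; /* handled */
}
```

The point of the sketch is only that each handler writes the value that
matches the interrupt it serviced; writing zero from every handler would
ack the RX threshold line and leave the others asserted.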
> The above-mentioned advisory is for PG1.0 and not for PG2.0.
> I am booting the net-next kernel.
> 
> [    0.000000] Booting Linux on physical CPU 0x0
> [    0.000000] Linux version 3.8.0-rc5-01248-gd2ed273 (a0131834@...31834-linux) (gcc version 4.5.3 20110311 (prerelease) (GCC) ) #21 SMP Wed Jan 30 163
> [    0.000000] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c53c7d
> [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> 
> [root@...go /]# uname -a
> Linux arago 3.8.0-rc5-01248-gd2ed273 #21 SMP Wed Jan 30 16:13:26 IST 2013 armv7l GNU/Linux
> 
> In theory what you are mentioning is correct. I have a BeagleBone Black and have yet to try it.
> 

I don't know what kind of silicon revision you have there.

FWIW both the original bone and the black show the exact same CPU: ARMv7 lines:

bone-original:
[    0.000000] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c5387d

bone-black: (with the known silicon version that has the fix)
[    0.000000] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c5387d

TBH I haven't found a simple way to print out the silicon revision number.
Does anyone on the list know a quick and dirty method?

> Regards
> Mugunthan V N

Regards

-- Pantelis

