linux-kernel - I2C transfer offload on i2c-mv64xxx devices

Message-ID: <f494cdfa-38c7-45a1-b511-ea8b2ff0090a@alliedtelesis.co.nz>
Date:   Tue, 19 Sep 2023 22:29:47 +0000
From:   Chris Packham <Chris.Packham@...iedtelesis.co.nz>
To:     "gregory.clement@...tlin.com" <gregory.clement@...tlin.com>,
        Andi Shyti <andi.shyti@...nel.org>
CC:     "linux-i2c@...r.kernel.org" <linux-i2c@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: I2C transfer offload on i2c-mv64xxx devices

Hi Gregory,

Are you able to provide a bit more context around this commit?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6cf70ae928bae

To save you a click, that's commit 6cf70ae928ba ("i2c: mv64xxx: Fix bus 
hang on A0 version of the Armada XP SoCs"). Basically, you added the 
"marvell,mv78230-a0-i2c" compatible and used that to disable the I2C 
transfer offload feature. That was almost 10 years ago, so I don't 
really expect anyone to remember.

I've been chasing an issue where certain I2C bus conditions (which I'm 
now injecting using another board and the i2c-gpio fault injection) 
cause a system-wide lockup on some Marvell SoCs. The response I've got 
from Marvell via their FAE is that these adverse bus conditions make 
the I2C controller assume that another master is accessing the bus; it 
will then wait for the other master to generate a STOP condition 
(which never happens).

Their suggestion was to check for the bus being idle (SDA/SCL high) 
before launching the transfer. That avoids the issue if SCL or SDA is 
shorted to ground, but it didn't help with the lockup caused by the 
incomplete_address_phase or incomplete_write_byte fault injections. 
Their response to that was basically "meh, protocol error".
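
For illustration, the idle check described above could look roughly like 
the sketch below. This is a standalone model, not code from i2c-mv64xxx: 
the struct, the callback names and the fake pin readers are all 
hypothetical, and a real driver would sample the actual pins. Note that 
a check like this only sees the bus state before the transfer starts, 
which is consistent with it not helping for faults injected mid-transfer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical line-sampling callbacks (not part of any real driver). */
struct bus_lines {
	int (*get_scl)(void *ctx);	/* returns 1 if SCL reads high */
	int (*get_sda)(void *ctx);	/* returns 1 if SDA reads high */
	void *ctx;
};

/* Fake pins so the sketch can be exercised without hardware. */
struct fake_pins {
	int scl;
	int sda;
};

static int fake_get_scl(void *ctx) { return ((struct fake_pins *)ctx)->scl; }
static int fake_get_sda(void *ctx) { return ((struct fake_pins *)ctx)->sda; }

/* The bus counts as idle only if both lines read high on every sample. */
static bool bus_is_idle(const struct bus_lines *b, int samples)
{
	assert(b != NULL && b->get_scl != NULL && b->get_sda != NULL);

	for (int i = 0; i < samples; i++) {
		if (!b->get_scl(b->ctx) || !b->get_sda(b->ctx))
			return false;
	}
	return true;
}

/* Returns 0 if the transfer may be launched, -1 if a line is held low. */
static int start_transfer_if_idle(const struct bus_lines *b)
{
	if (!bus_is_idle(b, 3))
		return -1;	/* a driver might report a busy/stuck bus here */

	/* ... hand the message to the controller here ... */
	return 0;
}
```

With both fake pins set to 1 start_transfer_if_idle() returns 0; with 
either one set to 0 it returns -1 and the transfer is never started.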

As a temporary workaround we ended up putting the MPP into GPIO mode and 
making use of the i2c-gpio bus driver. That worked but has its own 
downsides when the CPU gets busy.

Initially I thought this affected only the newer ARM64 SoCs (CN9130 and 
AC5), but I eventually found that since commit fbffee74986c ("ARM: dts: 
Fix I2C repeated start issue on Armada-38x") we've been using the 
"marvell,mv78230-a0-i2c" compatible string on the Armada-38x, which is 
likely why I can't reproduce it on an Armada-385 based board. Using that 
compatible string to disable the offload on my AC5 based board and the 
CN9130-CRB seems to avoid the issue as well.
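
As a simplified user-space model of why the compatible string matters: 
the sketch below assumes the driver turns the offload on for 
"marvell,mv78230-i2c" (an assumption about the driver, not something 
stated above) and turns it off again when "marvell,mv78230-a0-i2c" is 
present. The helper names are hypothetical, and node_is_compatible() is 
only a stand-in for the kernel's of_device_is_compatible().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for of_device_is_compatible(): true if 'want' is in the list. */
static bool node_is_compatible(const char *const *compat, size_t n,
			       const char *want)
{
	assert(compat != NULL && want != NULL);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(compat[i], want) == 0)
			return true;
	}
	return false;
}

/* Model of the offload decision based on the node's compatible list. */
static bool offload_enabled(const char *const *compat, size_t n)
{
	bool enabled = false;

	/* Assumed: the newer-SoC compatible enables the offload. */
	if (node_is_compatible(compat, n, "marvell,mv78230-i2c"))
		enabled = true;

	/* The A0 compatible disables it, even if both strings are listed. */
	if (node_is_compatible(compat, n, "marvell,mv78230-a0-i2c"))
		enabled = false;

	return enabled;
}
```

In this model a node listing only the A0 compatible (plus the generic 
"marvell,mv64xxx-i2c" fallback) ends up with the offload disabled.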

I need to do more testing but it's likely we'll run with that as a 
change for our boards. I'm also thinking that the I2C offload feature is 
not really suitable for boards where the I2C bus is not completely 
reliable (in my case it is connected to SFP cages and we've seen all 
kinds of weird and wonderful errors due to different SFPs causing shorts 
or just generally misbehaving).

Does any of that sound like the issue from the A0 Armada XP?

Thanks,
Chris