Message-ID: <aJpTHKbLbTz-Z3bo@smile.fi.intel.com>
Date: Mon, 11 Aug 2025 23:31:24 +0300
From: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
To: Gabor Juhos <j4g8y7@...il.com>
Cc: Wolfram Sang <wsa@...nel.org>,
Wolfram Sang <wsa+renesas@...g-engineering.com>,
Andi Shyti <andi.shyti@...nel.org>,
Russell King <rmk+kernel@...linux.org.uk>,
Andrew Lunn <andrew@...n.ch>, Hanna Hawa <hhhawa@...zon.com>,
Robert Marko <robert.marko@...tura.hr>,
Linus Walleij <linus.walleij@...aro.org>, linux-i2c@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
Imre Kaloz <kaloz@...nwrt.org>, stable@...r.kernel.org
Subject: Re: [PATCH v2 3/3] i2c: pxa: handle 'Early Bus Busy' condition on Armada 3700
On Mon, Aug 11, 2025 at 09:49:57PM +0200, Gabor Juhos wrote:
> Under some circumstances I2C recovery fails on Armada 3700. At least
> on the Methode uDPU board, removing and replugging an SFP module fails
> often, like this:
>
> [ 36.953127] sfp sfp-eth1: module removed
> [ 38.468549] i2c i2c-1: i2c_pxa: timeout waiting for bus free
> [ 38.486960] sfp sfp-eth1: module MENTECHOPTO POS22-LDCC-KR rev 1.0 sn MNC208U90009 dc 200828
> [ 38.496867] mvneta d0040000.ethernet eth1: unsupported SFP module: no common interface modes
> [ 38.521448] hwmon hwmon2: temp1_input not attached to any thermal zone
> [ 39.249196] sfp sfp-eth1: module removed
> ...
> [ 292.568799] sfp sfp-eth1: please wait, module slow to respond
> ...
> [ 625.208814] sfp sfp-eth1: failed to read EEPROM: -EREMOTEIO
>
> Note that the 'unsupported SFP module' messages are not relevant. The
> module is used only for testing the I2C recovery functionality, because
> the error can be triggered easily with this specific one.
>
> Enabling debug in the i2c-pxa driver reveals the following:
>
> [ 82.034678] sfp sfp-eth1: module removed
> [ 90.008654] i2c i2c-1: slave_0x50 error: timeout with active message
> [ 90.015112] i2c i2c-1: msg_num: 2 msg_idx: 0 msg_ptr: 0
> [ 90.020464] i2c i2c-1: IBMR: 00000003 IDBR: 000000a0 ICR: 000007e0 ISR: 00000802
> [ 90.027906] i2c i2c-1: log:
> [ 90.030787]
>
> This continues until the retries are exhausted ...
>
> [ 110.192489] i2c i2c-1: slave_0x50 error: exhausted retries
> [ 110.198012] i2c i2c-1: msg_num: 2 msg_idx: 0 msg_ptr: 0
> [ 110.203323] i2c i2c-1: IBMR: 00000003 IDBR: 000000a0 ICR: 000007e0 ISR: 00000802
> [ 110.210810] i2c i2c-1: log:
> [ 110.213633]
>
> ... then the whole sequence starts again ...
>
> [ 115.368641] i2c i2c-1: slave_0x50 error: timeout with active message
>
> ... while finally the SFP core gives up:
>
> [ 671.975258] sfp sfp-eth1: failed to read EEPROM: -EREMOTEIO
>
> When we analyze the log, it can be seen that bits 1 and 11 are set in the
> ISR (Interface Status Register). Bit 1 indicates the ACK/NACK status, but
> the purpose of bit 11 is unfortunately not documented in the driver code.
>
> The 'Functional Specification' document of the Armada 3700 SoC family
> however says that this bit indicates an 'Early Bus Busy' condition. The
> document also notes that whenever this bit is set, it is not possible to
> initiate a transaction on the I2C bus. The observed behaviour corresponds
> to this statement.
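For readers following the register dumps, here is a minimal sketch of how the ISR value 0x00000802 from the log decomposes into bits 1 and 11. The helper name `ex_set_bits` is hypothetical and is not part of the i2c-pxa driver; it only illustrates the arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not driver code: store the indices of the set
 * bits of 'value' in out[] (lowest bit first) and return how many
 * bits were set.
 */
static size_t ex_set_bits(uint32_t value, unsigned int out[32])
{
	size_t n = 0;

	for (unsigned int bit = 0; bit < 32; bit++) {
		if (value & (UINT32_C(1) << bit))
			out[n++] = bit;
	}

	return n;
}
```

Called with 0x00000802, the helper reports bit 1 (ACK/NACK status) and bit 11 (the 'Early Bus Busy' bit according to the quoted specification), since 0x802 = 0x800 + 0x002.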
>
> Unfortunately, I2C recovery does not help as it never runs in this
> special case. The driver checks the busyness of the bus in several
> places, but since these checks do not consider the A3700-specific bit,
> it cannot determine the actual status of the bus correctly, which
> results in the errors above.
>
> In order to fix the problem, add a new member to struct 'i2c_pxa' to
> store a controller specific bitmask containing the bits indicating the
> busy status, and use that in the code while checking the actual status
> of the bus. This ensures that the correct status can be determined on
> the Armada 3700 based devices without causing functional changes on
> devices based on other SoCs.
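The idea described above can be sketched roughly as follows. This is not the code from the patch: the struct, the field name `busy_mask`, and the `EX_`-prefixed macros are hypothetical. Only the meaning of bit 11 is taken from the commit message; the positions of the "unit busy" and "bus busy" bits (2 and 3) are an assumption about the generic controller layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names only; bit 11 is per the commit message. */
#define EX_ISR_UB	(UINT32_C(1) << 2)	/* assumed: unit busy */
#define EX_ISR_IBB	(UINT32_C(1) << 3)	/* assumed: bus busy */
#define EX_ISR_EBB	(UINT32_C(1) << 11)	/* A3700 'Early Bus Busy' */

/* Hypothetical stand-in for the per-controller driver state. */
struct ex_i2c {
	uint32_t busy_mask;	/* controller-specific busy bits */
};

/* The bus counts as busy if any controller-specific busy bit is set. */
static bool ex_bus_is_busy(const struct ex_i2c *i2c, uint32_t isr)
{
	return (isr & i2c->busy_mask) != 0;
}
```

With a mask of only `EX_ISR_UB | EX_ISR_IBB`, the ISR value 0x00000802 from the log reads as "not busy", which matches the observation that recovery never runs. Adding `EX_ISR_EBB` to the mask for the Armada 3700 makes the same value read as busy, while other controllers keep their existing mask and behaviour.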
>
> With the change applied, the driver detects the busy condition, and runs
> the recovery process:
>
> [ 742.617312] i2c i2c-1: state:i2c_pxa_wait_bus_not_busy:449: ISR=00000802, ICR=000007e0, IBMR=03
> [ 742.626099] i2c i2c-1: i2c_pxa: timeout waiting for bus free
> [ 742.631933] i2c i2c-1: recovery: resetting controller, ISR=0x00000802
> [ 742.638421] i2c i2c-1: recovery: IBMR 0x00000003 ISR 0x00000000
>
> This clears the EBB bit in the ISR, which makes it possible to
> initiate transactions on the I2C bus again.
>
> After this patch, the SFP module used for testing can be removed and
> replugged numerous times without causing the error described at the
> beginning. Previously, the error happened after a few such attempts.
>
> The patch has also been tested with the following kernel versions:
> 5.10.237, 5.15.182, 6.1.138, 6.6.90, 6.12.28, 6.14.6. It improves
> recoverability on all of them.
...
> Note: the patch is included in this series for completeness; however,
> it can be applied independently of the preceding patches. On kernels
> 6.3+, it restores I2C functionality even on its own because it recovers
> the controller from the bad state described in the previous patch.
Sounds to me like this one should be applied first, independently of the
discussion / conclusion on patch 1.
...
Code-wise it looks reasonable to me, but I haven't reviewed it properly
and probably won't have time to, hence no tags.
--
With Best Regards,
Andy Shevchenko