[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <509abfeb-b1fb-4c53-9898-6106c8dde411@lunn.ch>
Date: Tue, 27 Aug 2024 14:18:31 +0200
From: Andrew Lunn <andrew@...n.ch>
To: Jiawen Wu <jiawenwu@...stnetic.com>
Cc: andi.shyti@...nel.org, jarkko.nikula@...ux.intel.com,
andriy.shevchenko@...ux.intel.com, mika.westerberg@...ux.intel.com,
jsd@...ihalf.com, davem@...emloft.net, edumazet@...gle.com,
kuba@...nel.org, pabeni@...hat.com, rmk+kernel@...linux.org.uk,
linux-i2c@...r.kernel.org, netdev@...r.kernel.org,
mengyuanlou@...-swift.com, duanqiangwen@...-swift.com
Subject: Re: [PATCH net 0/3] Add I2C bus lock for Wangxun
> > More details please.
> >
> > Linux assume it is driving the hardware. Your firmware cannot be
> > touching any registers which will clear on read. QSFP states that
> > registers 3-31 of page 0 are all clear on read, for example. The
> > firmware should also not be setting any registers, otherwise you can
> > confuse Linux which assumes registers it set stay set, because it is
> > controlling the hardware.
> >
> > Your firmware also needs to handle that Linux can change the page. If
> > the firmware changes the page, it must restore it back to whatever
> > page Linux selected, etc.
> >
> > The fact you are submitting this for net suggests you have seen real
> > issues. Please describe what those issues are.
>
> The error log shows:
>
> [257681.367827] sfp sfp.1025: Host maximum power 1.0W
> [257681.370813] txgbe 0000:04:00.1: 31.504 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x4 link at 0000:02:02.0 (capable
> of 63.008 Gb/s with 8.0 GT/s PCIe x8 link)
> [257681.373364] txgbe 0000:04:00.1 enp4s0f1: renamed from eth0
> [257681.434719] txgbe 0000:04:00.1 enp4s0f1: configuring for inband/10gbase-r link mode
> [257681.676747] sfp sfp.1025: EEPROM base structure checksum failure: 0x63 != 0x1f
> [257681.676755] sfp EE: 00000000: 03 04 07 10 00 00 01 00 00 00 00 06 67 02 00 00 ............g...
> [257681.676757] sfp EE: 00000010: 1e 0f 00 00 46 69 62 65 72 53 74 6f 72 65 20 64 ....FiberStore d
> [257681.676759] sfp EE: 00000020: 20 20 20 20 00 00 1b 21 53 46 50 2d 31 30 47 53 ...!SFP-10GS
> [257681.676760] sfp EE: 00000030: 52 2d 38 35 20 20 20 20 41 20 20 20 03 52 00 1f R-85 A .R..
> [257681.676762] sfp EE: 00000040: 00 81 cd 5b df 25 0a bd 40 f6 c6 ce 47 8e ff ff ...[.%..@...G...
> [257681.676763] sfp EE: 00000050: 10 d8 24 33 44 8e ff ff 10 41 b0 9a ff ff ff ff ..$3D....A......
>
> It looks like some fields are read incorrectly. For comparison, I printed the
> SFP info when it loaded correctly:
>
> [260908.194533] sfp EE: 00000000: 03 04 07 10 00 00 01 00 00 00 00 06 67 02 00 00 ............g...
> [260908.194536] sfp EE: 00000010: 1e 0f 00 00 46 69 62 65 72 53 74 6f 72 65 20 20 ....FiberStore
> [260908.194538] sfp EE: 00000020: 20 20 20 20 00 00 1b 21 53 46 50 2d 31 30 47 53 ...!SFP-10GS
> [260908.194540] sfp EE: 00000030: 52 2d 38 35 20 20 20 20 41 20 20 20 03 52 00 1f R-85 A .R..
> [260908.194541] sfp EE: 00000040: 40 63 bd df 40 8e ff ff 10 41 b0 9a ff ff ff ff @c..@....A......
> [260908.194543] sfp EE: 00000050: 10 58 5b 29 41 8e ff ff 10 41 b0 9a ff ff ff ff .X[)A....A......
> [260908.198205] sfp sfp.1025: module FiberStore SFP-10GSR-85 rev A sn G1804125607 dc 180605
>
> Since the read mechanism of I2C is to write the offset and read command
> first, and then read the target address. I think it's possible that the different
> offsets be written at the same time, from Linux and firmware.
O.K, that is bad. The SFP is totally unreliable...
You however have still not answered my question. What is the firmware
accessing? How does it handle pages?
The hack you have put in place is per i2c transaction. But accessing
pages is likely to be multiple transactions. One to change the page,
followed by a few reads/writes in the new page, then maybe followed by
a transactions to return to page 0.
I think your best solution is to simply take the mutex and never
release it. Block your firmware from accessing the SFP.
Andrew
Powered by blists - more mailing lists