[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250626-reprint-curtly-358d13458b66@spud>
Date: Thu, 26 Jun 2025 16:40:55 +0100
From: Conor Dooley <conor@...nel.org>
To: Andi Shyti <andi.shyti@...nel.org>
Cc: Conor Dooley <conor.dooley@...rochip.com>, linux-i2c@...r.kernel.org,
Daire McNamara <daire.mcnamara@...rochip.com>,
linux-kernel@...r.kernel.org
Subject: Re: [RESEND PATCH v1] i2c: microchip-core: re-fix fake detections w/
i2cdetect
On Wed, Jun 25, 2025 at 11:50:28PM +0200, Andi Shyti wrote:
> Hi Conor,
>
> On Wed, Jun 25, 2025 at 02:27:51PM +0100, Conor Dooley wrote:
> > On Thu, Jun 12, 2025 at 12:12:49PM +0100, Conor Dooley wrote:
> > > Introducing support for smbus re-broke i2cdetect, causing it to detect
> > > devices at every i2c address, just as it did prior to being fixed in
> > > commit 49e1f0fd0d4cb ("i2c: microchip-core: fix "ghost" detections").
> > > This was caused by an oversight, where the new smbus code failed to
> > > check the return value of mchp_corei2c_xfer(). Check it, and propagate
> > > any errors.
> > >
> > > Fixes: d6ceb40538263 ("i2c: microchip-corei2c: add smbus support")
> > > Signed-off-by: Conor Dooley <conor.dooley@...rochip.com>
> > > ---
> > >
> > > Resending cos I think it attempted a send using my korg address on a
> > > network where that is not possible.
> >
> > Seemingly this patch has exposed an issue that causes a hang that was
> > not noticed previously:
> > # cd /sys/bus/iio/devices/iio\:device0
> > # cat *
> > VDDREG_IBUS_1
> > 0
> > [ 92.178899] ------------[ cut here ]------------
> > [ 92.178921] WARNING: CPU: 0 PID: 291 at kernel/workqueue.c:2496 __queue_delayed_work+0xb4/0xbe
> > [ 92.178981] Modules linked in: pac1934 industrialio autofs4
> > [ 92.179024] CPU: 0 UID: 0 PID: 291 Comm: cat Not tainted 6.12.22 #1
> > [ 92.179045] Hardware name: Microchip PolarFire-SoC Icicle Kit (DT)
> > [ 92.179055] epc : __queue_delayed_work+0xb4/0xbe
> > [ 92.179081] ra : mod_delayed_work_on+0x4a/0xa6
> > [ 92.179107] epc : ffffffff8002b2f0 ra : ffffffff8002b3ba sp : ffffffc600863b70
> > [ 92.179122] gp : ffffffff812d6068 tp : ffffffe5a5bc24c0 t0 : 0000000000000000
> > [ 92.179136] t1 : 000000000000ff00 t2 : ffffffff80c01210 s0 : ffffffc600863b90
> > [ 92.179150] s1 : ffffffe5a2aa7568 a0 : 0000000000000040 a1 : ffffffe5a2054000
> > [ 92.179164] a2 : ffffffe5a2aa7568 a3 : 0000000000003a98 a4 : 0000000000000000
> > [ 92.179178] a5 : ffffffff8002b1fa a6 : 0000000000000000 a7 : 0000000000000000
> > [ 92.179191] s2 : 0000000000000040 s3 : ffffffe5a2054000 s4 : 0000000000003a98
> > [ 92.179205] s5 : 0000000000000000 s6 : ffffffc600863c5c s7 : ffffffc600863c58
> > [ 92.179219] s8 : 0000000000400cc0 s9 : fffffffffffff000 s10: 000000007ffff000
> > [ 92.179233] s11: ffffffe5a2806128 t3 : 0000000000ff0000 t4 : 0000000000000000
> > [ 92.179247] t5 : 0000000000000000 t6 : 0000000000000000
> > [ 92.179258] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
> > [ 92.179273] [<ffffffff8002b2f0>] __queue_delayed_work+0xb4/0xbe
> > [ 92.179302] [<ffffffff8002b3ba>] mod_delayed_work_on+0x4a/0xa6
> > [ 92.179331] [<ffffffff01375980>] pac1934_read_raw+0xba/0x1fc [pac1934]
> > [ 92.179392] [<ffffffff0134942e>] iio_read_channel_info+0xae/0xc4 [industrialio]
> > [ 92.179704] [<ffffffff80442fde>] dev_attr_show+0x14/0x46
> > [ 92.179748] [<ffffffff8020ca3e>] sysfs_kf_seq_show+0x6c/0xe2
> > [ 92.179788] [<ffffffff8020b15c>] kernfs_seq_show+0x18/0x20
> > [ 92.179814] [<ffffffff801bc1ca>] seq_read_iter+0xd0/0x328
> > [ 92.179844] [<ffffffff8020b634>] kernfs_fop_read_iter+0xfa/0x15c
> > [ 92.179871] [<ffffffff8018e128>] vfs_read+0x1aa/0x25a
> > [ 92.179895] [<ffffffff8018e998>] ksys_read+0x5a/0xcc
> > [ 92.179915] [<ffffffff8018ea1e>] __riscv_sys_read+0x14/0x1c
> > [ 92.179936] [<ffffffff807662f8>] do_trap_ecall_u+0x1ac/0x1d8
> > [ 92.179981] [<ffffffff8076e3ba>] handle_exception+0x146/0x152
> > [ 92.180018] ---[ end trace 0000000000000000 ]---
> > (The issue was detected on an internal 6.12 based branch, but the
> > content there is identical to what's in mainline + this patch).
> >
> > Obviously adding the check for an error here doesn't cause there to be
> > problems with a transfer, only actually report problems that have
> > occurred. I have not yet been able to reproduce this on my setup, but
> > the reporter saw the issues going away when they disabled hardware smbus
> > support and used software emulation instead.
> >
> > I don't know if this has any bearing on applying the patch, but I wanted
> > to mention it for reasons that should be apparent. I'm looking into the
> > issue still and hope to figure out what's going wrong.
>
> Thanks for letting me know. I'll hold off on this for now. Please
> ping me once you have some results.
Well, I made a mistake in my patch, so I'll be sending a v2. I still do
not know what is causing there to be issues with corruption of the
workqueue in the client driver. Nothing really makes sense, as the bug
in my patch causes read date to never get populated into the rx data
array rather than writing to something invalid. There's nothing obvious
in the client driver either that is blindly writing into some address
based on the result (there are some reads indexed by it). I'll continue
to look at that more, but I have a v2 that fixes the patch here.
Download attachment "signature.asc" of type "application/pgp-signature" (229 bytes)
Powered by blists - more mailing lists