Message-ID: <6064d018.2b279.19740a7eb1c.Coremail.chenglingfei22s@ict.ac.cn>
Date: Thu, 5 Jun 2025 23:13:55 +0800 (GMT+08:00)
From: chenglingfei <chenglingfei22s@....ac.cn>
To: "Sven Peter" <sven@...nel.org>
Cc: j@...nau.net, alyssa@...enzweig.io, neal@...pa.dev,
zhangzhenwei22b@....ac.cn, chenglingfei22s@....ac.cn,
wangzhe12@....ac.cn, maddy@...ux.ibm.com, mpe@...erman.id.au,
npiggin@...il.com, christophe.leroy@...roup.eu, naveen@...nel.org,
andi.shyti@...nel.org, asahi@...ts.linux.dev,
linux-arm-kernel@...ts.infradead.org, linuxppc-dev@...ts.ozlabs.org,
linux-i2c@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: Re: [BUG] rmmod i2c-pasemi-platform causing kernel crash on
Apple M1.
> -----Original Message-----
> From: "Sven Peter" <sven@...nel.org>
> Sent: 2025-06-05 22:02:35 (Thursday)
> To: chenglingfei <chenglingfei22s@....ac.cn>
> Cc: j@...nau.net, alyssa@...enzweig.io, neal@...pa.dev, zhangzhenwei22b@....ac.cn, wangzhe12@....ac.cn, maddy@...ux.ibm.com, mpe@...erman.id.au, npiggin@...il.com, christophe.leroy@...roup.eu, naveen@...nel.org, andi.shyti@...nel.org, asahi@...ts.linux.dev, linux-arm-kernel@...ts.infradead.org, linuxppc-dev@...ts.ozlabs.org, linux-i2c@...r.kernel.org, linux-kernel@...r.kernel.org
> Subject: Re: [BUG] rmmod i2c-pasemi-platform causing kernel crash on Apple M1.
>
> Hi,
>
> On 05.06.25 13:55, chenglingfei wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: "Sven Peter" <sven@...nel.org>
> > > Sent: 2025-06-05 18:25:09 (Thursday)
> > > To: 程凌飞 <chenglingfei22s@....ac.cn>, j@...nau.net, alyssa@...enzweig.io, neal@...pa.dev
> > > Cc: zhangzhenwei22b@....ac.cn, wangzhe12@....ac.cn, maddy@...ux.ibm.com, mpe@...erman.id.au, npiggin@...il.com, christophe.leroy@...roup.eu, naveen@...nel.org, andi.shyti@...nel.org, asahi@...ts.linux.dev, linux-arm-kernel@...ts.infradead.org, linuxppc-dev@...ts.ozlabs.org, linux-i2c@...r.kernel.org, linux-kernel@...r.kernel.org
> > > Subject: Re: [BUG] rmmod i2c-pasemi-platform causing kernel crash on Apple M1.
> > >
> > > Hi,
> > >
> > > On 05.06.25 05:02, 程凌飞 wrote:
> > > > Hi, all!
> > > >
> > > > We’ve encountered a kernel crash when running rmmod i2c-pasemi-platform on a Mac Mini M1 (T8103) running Asahi Arch Linux.
> > > >
> > > > The bug was first found on Linux v6.6, which we built manually with the config provided by Asahi to run our services.
> > > > At that time, i2c-pasemi-platform was named i2c-apple.
> > > >
> > > > We noticed that in Linux v6.7 the pasemi driver was split into two separate modules, one of which is i2c-pasemi-platform.
> > > > Therefore, we built Linux v6.14.6 and tried rmmod i2c-pasemi-platform again, but the crash still occurs. Moreover, we fetched
> > > > the latest i2c-pasemi-platform from linux-next (911483b25612c8bc32a706ba940738cc43299496) and asahi, built them, and
> > > > tested again with Linux v6.14.6, but the crash remains.
> > > >
> > > > Because kexec is not supported and will never be fully supported on Apple Silicon platforms due to hardware and firmware
> > > > design constraints, we cannot record the panic logs through kdump.
> > >
> > > Do you have UART connected to a device under test which you could use to
> > > grab the panic log from the kernel? Alternatively you can also run the
> > > kernel under m1n1's hypervisor and grab the log that way. It'll emulate
> > > the serial port and redirect its output via USB.
> > >
> >
> > I don't have UART, but I have tried to run the kernel under m1n1's hypervisor. However, it does not trigger the release of cs42l83.
> > Given that m1n1 provides full peripheral device emulation capability, the most plausible explanation would be an incorrect
> > firmware loading sequence. But the Asahi documentation gives few details about how to generate an initramfs with
> > firmware (I think). Can you give more guidance on this?
>
> I'm not sure why you are even trying to create a special initramfs. Just
> load your usual kernel using the usual boot flow as a guest. There's
> also no firmware involved in i2c and I'm not sure what you mean with
> "full peripheral device emulation" either or how that's related to firmware.
> You also mention that the crash happens when you run rmmod so I again
> don't understand what "it does not trigger the release of cs42l83" means
> here.
>
Well, simply running rmmod i2c-pasemi-platform doesn't directly cause a crash.
The crash occurs when the module removal triggers device_remove for cs42l83,
which ultimately calls pasemi_smb_waitready in i2c-pasemi-platform. You may refer
to the brief analysis provided in my first email for more details.

When booting the kernel without m1n1's hypervisor, cs42l83 is automatically probed after
i2c-pasemi-platform loads and is subsequently removed when executing rmmod
i2c-pasemi-platform, resulting in a kernel crash. However, when booting under the
hypervisor, cs42l83 isn't probed or removed -- the device appears not to exist. This
observation is what led me to mention "full peripheral device emulation."

Furthermore, since cs42l83 remains untouched under the hypervisor, the chain of operations
involving device_remove and the subsequent call to pasemi_smb_waitready never occurs.
This prevents the crash scenario, which explains why I'm unable to reproduce
the crash when running under m1n1.

I can try again by "loading your usual kernel using the usual boot flow as a guest",
but I don't think it'll make much difference.
> >
> > > >
> > > > Thus we tried to find the root cause of the issue manually. When we perform rmmod, the kernel releases the devices on
> > > > the i2c bus and calls the remove function in snd-soc-cs42l83-i2c, which calls cs42l42_common_remove in cs42l42.
> > > > Because cs42l42->init_done is true, it performs regmap_write, which finally calls into pasemi_smb_waitready in
> > > > i2c-pasemi-core.c. We noticed that smbus->use_irq is true, and after it calls into wait_for_completion_timeout, the system crashes!
> > > >
> > > > We found that wait_for_completion_timeout is one of the core scheduler APIs used by tens of thousands of other drivers,
> > > > so it is unlikely to cause the crash. We then tried removing the call to wait_for_completion_timeout, after which the system
> > > > seems to run well.
> > > >
> > > > However, because we have little knowledge about i2c devices and specifications, we are not sure whether this change will
> > > > cause other potential harm to the system and device. Is this call to wait necessary here? Or can you give a more
> > > > sophisticated fix?
> > >
> > > Yes, that call is necessary. It waits for the "transfer completed"
> > > interrupt from the hardware. Without it the driver will try to read data
> > > before it's available and you'll see corruption. I'm surprised hardware
> > > attached to i2c (usb pd controller and audio I think) works at all with
> > > that change.
> > >
> > >
> > > Sven
> >
> > Are there any methods or tools to systematically verify its functionality? I am not sure whether the devices attached to i2c
> > should work well even after the i2c-pasemi-platform has been removed.
>
> I don't understand. You say you saw a crash inside pasemi_smb_waitready
> when calling wait_for_completion_timeout and decided to remove that
> method. When you remove the call you break the entire driver because it
> will now try to read data long before the i2c transaction has been
> completed.
> Obviously, no i2c device will work when the driver isn't loaded but
> without waiting for the completion they also won't work when the driver
> is loaded.
>
>
> Sven
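
Sven's point above, that the wait is what keeps the driver from reading data before the transfer has finished, can be illustrated with a small single-threaded userspace model. This is only a sketch under stated assumptions: `struct fake_bus`, `fake_wait_timeout`, `read_byte_waiting`, `read_byte_not_waiting` and the byte values are all hypothetical names invented for illustration, a "tick" stands in for time passing until the "transfer completed" interrupt would fire, and none of it is taken from i2c-pasemi-core.c. The only thing borrowed from the kernel is the return convention of wait_for_completion_timeout (0 on timeout, a positive value otherwise).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STALE_BYTE 0xFF /* hypothetical leftover value in the receive register */
#define REAL_BYTE  0x42 /* hypothetical byte the device eventually delivers */

/* Hypothetical bus model; NOT the real pasemi register layout. */
struct fake_bus {
    int ticks_until_done; /* simulated time left until the transfer finishes */
    bool done;            /* stands in for the "transfer completed" interrupt */
    uint8_t rx;           /* receive register; stale until the transfer is done */
};

static void fake_bus_start_read(struct fake_bus *bus, int latency_ticks)
{
    bus->ticks_until_done = latency_ticks;
    bus->done = false;
    bus->rx = STALE_BYTE;
}

/* Advance simulated time by one tick; complete the transfer when due. */
static void fake_bus_tick(struct fake_bus *bus)
{
    if (bus->done)
        return;
    if (bus->ticks_until_done > 0)
        bus->ticks_until_done--;
    if (bus->ticks_until_done == 0) {
        bus->rx = REAL_BYTE;
        bus->done = true;
    }
}

/* Returns 0 on timeout and a positive value if the transfer completed. */
static int fake_wait_timeout(struct fake_bus *bus, int timeout_ticks)
{
    while (timeout_ticks > 0) {
        if (bus->done)
            return timeout_ticks;
        fake_bus_tick(bus);
        timeout_ticks--;
    }
    return bus->done ? 1 : 0;
}

/* Correct pattern: start the transfer, wait for completion, then read. */
int read_byte_waiting(int latency_ticks, int timeout_ticks, uint8_t *out)
{
    struct fake_bus bus;

    assert(out != NULL);
    fake_bus_start_read(&bus, latency_ticks);
    if (fake_wait_timeout(&bus, timeout_ticks) == 0)
        return -1; /* timed out; *out is left untouched */
    *out = bus.rx;
    return 0;
}

/* Broken pattern: the wait is removed, so the register is read too early. */
int read_byte_not_waiting(int latency_ticks, uint8_t *out)
{
    struct fake_bus bus;

    assert(out != NULL);
    fake_bus_start_read(&bus, latency_ticks);
    *out = bus.rx; /* stale whenever the transfer needs any time at all */
    return 0;
}
```

In this model, `read_byte_waiting(3, 10, &b)` succeeds and stores `REAL_BYTE`, while `read_byte_not_waiting(3, &b)` also reports success but stores `STALE_BYTE`. That mirrors why removing the wait can make the crash disappear while silently breaking every transfer.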