Message-ID: <5c598fea.3165d.1973e0a9a3a.Coremail.chenglingfei22s@ict.ac.cn>
Date: Thu, 5 Jun 2025 11:02:51 +0800 (GMT+08:00)
From: 程凌飞 <chenglingfei22s@....ac.cn>
To: sven@...npeter.dev, j@...nau.net, alyssa@...enzweig.io, neal@...pa.dev
Cc: zhangzhenwei22b@....ac.cn, wangzhe12@....ac.cn,
chenglingfei22s@....ac.cn, maddy@...ux.ibm.com, mpe@...erman.id.au,
npiggin@...il.com, christophe.leroy@...roup.eu, naveen@...nel.org,
andi.shyti@...nel.org, asahi@...ts.linux.dev,
linux-arm-kernel@...ts.infradead.org, linuxppc-dev@...ts.ozlabs.org,
linux-i2c@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: [BUG] rmmod i2c-pasemi-platform causing kernel crash on Apple M1.
Hi, all!
We’ve encountered a kernel crash when running rmmod i2c-pasemi-platform on a Mac Mini M1 (T8103) running Asahi Arch Linux.
The bug was first found on Linux v6.6, which we built manually with the config provided by Asahi in order to run our services.
At that time, the module was named i2c-apple rather than i2c-pasemi-platform.
We noticed that in Linux v6.7 the pasemi driver was split into two separate modules, one of which is i2c-pasemi-platform.
Therefore, we built Linux v6.14.6 and ran rmmod i2c-pasemi-platform again; the crash still occurs. Moreover, we fetched
the latest i2c-pasemi-platform from linux-next (911483b25612c8bc32a706ba940738cc43299496) and from asahi, built them, and
tested again with Linux v6.14.6, but the crash remains.
Because kexec is not supported, and will never be fully supported, on Apple Silicon platforms due to hardware and firmware
design constraints, we cannot record the panic logs through kdump.
Thus we tried to find the root cause of the issue manually. When we run rmmod, the kernel releases the devices on
the i2c bus and calls the remove function in snd-soc-cs42l83-i2c, which calls cs42l42_common_remove in cs42l42.
Because cs42l42->init_done is true, it performs a regmap_write, which finally calls into pasemi_smb_waitready in
i2c-pasemi-core.c. We noticed that smbus->use_irq is true, and after it calls into wait_for_completion_timeout, the
system crashes!
wait_for_completion_timeout is a core kernel API used by a very large number of other drivers, so it is unlikely to be
the cause of the crash by itself. We tried removing the call to wait_for_completion_timeout, and the system then seems
to run well.
However, because we have little knowledge of i2c devices and specifications, we are not sure whether this change could
cause other problems for the system or the device. Is this call to wait necessary here? Or can you suggest a more
sophisticated fix?
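To make the question concrete, here is a minimal userspace sketch of the general alternative to deleting the wait outright: a bounded polling loop on a status flag. This is not the i2c-pasemi driver code. The names fake_dev, fake_read_status, STATUS_DONE, and wait_ready_polled are hypothetical and exist only for illustration. The assumption behind it is that a transfer needs some form of wait before its status is trusted.

```c
#include <stdint.h>

/* Hypothetical status bit and return codes, for illustration only. */
#define STATUS_DONE  0x1u
#define WAIT_OK      0
#define WAIT_TIMEOUT (-1)

/*
 * Hypothetical model of a device: its status register reports DONE
 * only after a given number of reads have elapsed.
 */
struct fake_dev {
	unsigned int reads_until_done; /* reads remaining before DONE is set */
	unsigned int reads_seen;       /* total status reads performed */
};

/* Stand-in for a register read; a real driver would access hardware here. */
static uint32_t fake_read_status(struct fake_dev *dev)
{
	dev->reads_seen++;
	if (dev->reads_until_done == 0)
		return STATUS_DONE;
	dev->reads_until_done--;
	return 0;
}

/*
 * Poll the status until DONE is set or max_polls reads have been made.
 * A real driver would sleep between polls; this model only counts reads.
 */
static int wait_ready_polled(struct fake_dev *dev, unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if (fake_read_status(dev) & STATUS_DONE)
			return WAIT_OK;
	}
	return WAIT_TIMEOUT;
}
```

In this model, a device that becomes ready after three reads returns WAIT_OK on the fourth read, while a device that never becomes ready returns WAIT_TIMEOUT once the poll budget is used up. Whether a polling path like this is appropriate during module removal is exactly the open question above.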
We’re happy to provide additional logs, configs, or testing assistance. Any guidance would be greatly appreciated!