[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1ffa28bf6e3dcde83a6a6a5dde163596c4db639d.camel@mediatek.com>
Date: Tue, 26 Aug 2025 07:58:47 +0000
From: Xion Wang (王鑫) <Xion.Wang@...iatek.com>
To: "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mediatek@...ts.infradead.org" <linux-mediatek@...ts.infradead.org>,
wsd_upstream <wsd_upstream@...iatek.com>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
Huadian Liu (刘华典) <huadian.liu@...iatek.com>,
"matthias.bgg@...il.com" <matthias.bgg@...il.com>, "arnd@...db.de"
<arnd@...db.de>, AngeloGioacchino Del Regno
<angelogioacchino.delregno@...labora.com>
Subject: Re: [PATCH 1/1] misc: Prevent double registration and deregistration
of miscdevice
On Tue, 2025-08-26 at 09:05 +0200, gregkh@...uxfoundation.org wrote:
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
>
>
> On Tue, Aug 26, 2025 at 02:55:59AM +0000, Xion Wang (王鑫) wrote:
> > On Mon, 2025-08-25 at 22:28 +0200, Greg Kroah-Hartman wrote:
> > > External email : Please do not click links or open attachments
> > > until
> > > you have verified the sender or the content.
> > >
> > >
> > > On Mon, Aug 25, 2025 at 04:45:47PM +0800, xion.wang@...iatek.com
> > > wrote:
> > > > From: Xion Wang <xion.wang@...iatek.com>
> > > >
> > > > When repeated calls to misc_register() or misc_deregister() on
> > > > the
> > > > same miscdevice could lead to kernel crashes or misc_list
> > > > corruption due to
> > > > multiple INIT_LIST_HEAD or list_del operations on the same list
> > > > node.
> > > >
> > > > This patch improves the robustness of the misc device driver by
> > > > preventing
> > > > both double registration and double deregistration of
> > > > miscdevice
> > > > instances.
> > > >
> > > > Signed-off-by: Xion Wang <xion.wang@...iatek.com>
> > > > ---
> > > > drivers/char/misc.c | 7 ++++++-
> > > > 1 file changed, 6 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/char/misc.c b/drivers/char/misc.c
> > > > index 558302a64dd9..2f8666312966 100644
> > > > --- a/drivers/char/misc.c
> > > > +++ b/drivers/char/misc.c
> > > > @@ -210,6 +210,9 @@ int misc_register(struct miscdevice *misc)
> > > > int err = 0;
> > > > bool is_dynamic = (misc->minor == MISC_DYNAMIC_MINOR);
> > > >
> > > > + if (WARN_ON(misc->this_device))
> > > > + return -EEXIST;
> > >
> > > You just crashed the kernel if this ever triggers (remember when
> > > panic-on-warn is set)
> > >
> > > So please, if this can happen, properly handle it.
> > >
> > > > +
> > > > INIT_LIST_HEAD(&misc->list);
> > > >
> > > > mutex_lock(&misc_mtx);
> > > > @@ -251,6 +254,7 @@ int misc_register(struct miscdevice *misc)
> > > > misc->minor = MISC_DYNAMIC_MINOR;
> > > > }
> > > > err = PTR_ERR(misc->this_device);
> > > > + misc->this_device = NULL;
> > > > goto out;
> > > > }
> > > >
> > > > @@ -275,12 +279,13 @@ EXPORT_SYMBOL(misc_register);
> > > >
> > > > void misc_deregister(struct miscdevice *misc)
> > > > {
> > > > - if (WARN_ON(list_empty(&misc->list)))
> > > > + if (WARN_ON(!misc->this_device))
> > > > return;
> > > >
> > > > mutex_lock(&misc_mtx);
> > > > list_del(&misc->list);
> > > > device_destroy(&misc_class, MKDEV(MISC_MAJOR, misc-
> > > > >minor));
> > > > + misc->this_device = NULL;
> > >
> > > You are overloading the pointer here to mean something, please
> > > don't.
> > >
> > > Again, why would this ever happen? What in-tree driver does
> > > this?
> > >
> > > thanks,
> > >
> > > greg k-h
> >
> >
> > This issue was encountered during MTK internal stress testing,
> > specifically in the WiFi module on/off process. If the WiFi module
> > fails during the "on" process, it triggers an "off" process.
> > However,
> > if the "off" process also fails, the module may not be properly
> > deinitialized, and the misc device may not be correctly
> > deregistered.
> > On the next WiFi "on" attempt, repeated registration of the misc
> > device
> > leads to corruption of the misc_list. Subsequently, when a device
> > calls
> > misc_open, it may acquire the misc_lock and enter an infinite loop
> > in
> > list_for_each_entry due to the corrupted list, while other threads
> > attempting to access their misc device nodes become blocked waiting
> > for
> > the misc_lock.
>
> What driver is this? And wifi devices should be using the rfkill
> api,
> which only registers a misc device at module load time. A wifi
> driver
> should not be creating a miscdevice itself, that feels very very
> wrong.
>
> > This scenario exposes two issues:
> >
> > Incomplete failure handling in our internal WiFi module's on/off
> > process (which we have already addressed internally).
> > The lack of a mechanism in the miscdevice framework to prevent
> > repeated
> > registration or deregistration, which would improve its robustness.
>
> Again, this shouldn't be something that any driver should hit as this
> usage is not in the kernel tree that I can see. Attempting to
> re-register a device multiple times is normally never a good idea.
>
>
Thank you for your comments.
I am not the owner of the WiFi driver and do not have full details of
its internal logic. However, during internal integration and stress
testing, we observed an issue where repeated registration and
deregistration of a misc device by the WiFi module led to corruption of
the misc_list. While I cannot provide the exact reasoning behind the
WiFi driver's design, I wanted to report the problem and share our
findings with the community in case similar patterns exist elsewhere,
including in vendor or out-of-tree drivers.
Below is a real log excerpt showing the kernel panic caused by repeated
misc device registration:
[21998.088014].(7)[T770 ] mtk_wland_threa:
[wlan][770]wlanOnAtReset:(INIT DEBUG) reset success
[22001.380742].(4)[T776 ] wlan_rst_thread: [WIFI-FW]
mtk_wcn_wlan_func_ctrl[E]: WiFi on/off takes more than 15 seconds
[22001.382096].(4)[T776 ] wlan_rst_thread: [MTK-WIFI]
wifi_reset_end[E]: WMT turn on WIFI fail!
[22002.194944].(4)[T770 ] mtk_wland_threa:
[wlan][770]kalIoctlByBssIdx:(OID WARN) Driver is resetting.
[22002.194949].(4)[T770 ] mtk_wland_threa:
[wlan][770]wlanOnAtReset:(REQ WARN) disassociate error:c0010011
[22002.194953].(4)[T770 ] mtk_wland_threa:
[wlan][770]hifAxiProbe:(INIT ERROR) pfWlanProbe fail!
[22002.194958].(4)[T770 ] mtk_wland_threa:
[wlan][770]wlanFuncOnImpl:(HAL ERROR) glBusFuncOn failed, ret=-1
[22002.970751].(5)[T776 ] wlan_rst_thread:
[wlan][776]glResetMsgHandler:(INIT WARN) WF chip reset start!
[22002.985825].(6)[T776 ] wlan_rst_thread:
[wlan][776]glResetMsgHandler:(INIT WARN) WF chip reset end!
[22002.990568].(6)[T776 ] wlan_rst_thread: [MTK-WIFI]
wifi_reset_end[W]: WIFI is already off.
[22016.162096].(4)[T776 ] wlan_rst_thread: [MTK-WIFI]
wifi_reset_end[W]: WIFI state recovering...
[22016.163209].(4)[T776 ] wlan_rst_thread: [MTK-WIFI]
wifi_reset_end[W]: WIFI is already off.
[22025.798095].(7)[T4045 ] binder:4037_3: [MTK-WIFI] WIFI_write[I]:
WIFI_write 1, length 2, copy_size 2
[22025.827948].(5)[T770 ] mtk_wland_threa: [wlan][770]wlanProbe:(INIT
DEBUG) enter wlanProbe
[22053.185845].(6)[T770 ] mtk_wland_threa: sysfs: cannot create
duplicate filename '/devices/virtual/misc/wlan'
[22053.187114].(6)[T770 ] mtk_wland_threa: CPU: 6 UID: 0 PID: 770
Comm: mtk_wland_threa Tainted: G OE 6.12.30-android16-5-
gc0222621e3b8-4k #1 c67b8daf991e21c8488e3ba1448fdac319baaf99
[22053.187118].(6)[T770 ] mtk_wland_threa: Tainted: [O]=OOT_MODULE,
[E]=UNSIGNED_MODULE
[22053.187119].(6)[T770 ] mtk_wland_threa: Hardware name: MT6858(ENG)
(DT)
[22053.187121].(6)[T770 ] mtk_wland_threa: Call trace:
[22053.187122].(6)[T770 ] mtk_wland_threa: dump_backtrace+0xf8/0x178
[22053.187130].(6)[T770 ] mtk_wland_threa: show_stack+0x18/0x28
[22053.187133].(6)[T770 ] mtk_wland_threa: dump_stack_lvl+0x40/0x88
[22053.187136].(6)[T770 ] mtk_wland_threa: dump_stack+0x18/0x24
[22053.187139].(6)[T770 ] mtk_wland_threa: sysfs_warn_dup+0x6c/0xc8
[22053.187143].(6)[T770 ]
mtk_wland_threa: sysfs_create_dir_ns+0xa0/0xf8
[22053.187145].(6)[T770 ]
mtk_wland_threa: kobject_add_internal+0x1b8/0x47c
[22053.187148].(6)[T770 ] mtk_wland_threa: kobject_add+0x90/0x104
[22053.187151].(6)[T770 ] mtk_wland_threa: device_add+0x1a4/0x494
[22053.187154].(6)[T770 ]
mtk_wland_threa: device_create_groups_vargs+0xd4/0x168
[22053.187156].(6)[T770 ]
mtk_wland_threa: device_create_with_groups+0x48/0x74
[22053.187158].(6)[T770 ] mtk_wland_threa: misc_register+0x1b8/0x28c
[22053.187160].(6)[T770 ]
mtk_wland_threa: kalWlanUeventInit+0x3c/0xfc [wlan_drv_gen4m_6858]
[22053.187864].(6)[T770 ] mtk_wland_threa: wlanProbe+0x60c/0x830
[wlan_drv_gen4m_6858]
[22053.188556].(6)[T770 ] mtk_wland_threa: glBusFuncOn+0x50/0xfc
[wlan_drv_gen4m_6858]
[22053.189246].(6)[T770 ] mtk_wland_threa: wlanFuncOnImpl+0x78/0x140
[wlan_drv_gen4m_6858]
[22053.189934].(6)[T770 ] mtk_wland_threa: wlanFuncOn+0x6c/0x128
[wlan_drv_gen4m_6858]
[22053.190622].(6)[T770 ]
mtk_wland_threa: wlan_func_on_by_chrdev+0x98/0x108
[wlan_drv_gen4m_6858]
[22053.191307].(6)[T770 ]
mtk_wland_threa: mtk_wland_thread_main+0x98/0x334
[wmt_chrdev_wifi_connac2]
[22053.191318].(6)[T770 ] mtk_wland_threa: kthread+0x110/0x1a4
[22053.191321].(6)[T770 ] mtk_wland_threa: ret_from_fork+0x10/0x20
[22053.218071].(6)[T770 ] mtk_wland_threa: kobject:
kobject_add_internal failed for wlan with -EEXIST, don't try to
register things with the same name in the same directory.
[22053.220025].(6)[T770 ] mtk_wland_threa:
[wlan][770]kalWlanUeventInit:(INIT WARN) misc_register error:-17
I understand that this usage is not typical for in-tree drivers, and
that robust failure handling is expected.
Regarding the robustness check in misc_deregister(), the current
WARN_ON(list_empty(&misc->list)) does not fully prevent errors such as
deregistering an unregistered or already deregistered device. If the
community feels such checks are unnecessary, I am open to removing or
revising this logic as advised.
Thank you again for your guidance and feedback.
Powered by blists - more mailing lists