Message-ID: <0d9145ee-7f25-429c-934e-7dd9bde31bd5@bootlin.com>
Date: Tue, 2 Sep 2025 16:42:04 +0200
From: Bastien Curutchet <bastien.curutchet@...tlin.com>
To: Lee Jones <lee@...nel.org>
Cc: Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
Miquèl Raynal <miquel.raynal@...tlin.com>,
Cheng Ming Lin <linchengming884@...il.com>,
Cheng Ming Lin <chengminglin@...c.com.tw>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mfd: core: Increment of_node's refcount before linking it
to the platform device
Hi,
On 9/2/25 3:29 PM, Lee Jones wrote:
> On Wed, 20 Aug 2025, Bastien Curutchet wrote:
>
>> When an MFD device is added, a platform_device is allocated. If this
>> device is linked to a DT description, the corresponding OF node is linked
>> to the new platform device but the OF node's refcount isn't incremented.
>> As of_node_put() is called during the platform device release, it leads
>> to a refcount underflow.
>>
>> Call of_node_get() to increment the OF node's refcount when the node is
>> linked to the newly created platform device.
>>
>> Signed-off-by: Bastien Curutchet <bastien.curutchet@...tlin.com>
>> ---
>> Hi all,
>>
>> I'm currently working on a new MFD driver and I encountered some
>> underflow errors with the of_node refcount. As you can see in the logs
>> below, I reproduced the issue on a mainline driver (atmel-hlcdc):
>>
>>> # modprobe atmel-hlcdc
>>> # modprobe -r atmel-hlcdc
>>> # modprobe atmel-hlcdc
>>> # modprobe -r atmel-hlcdc
>>> [ 22.932128] OF: ERROR: of_node_release() detected bad of_node_put() on /amba_pl/atmel_sama5@...00000/dc
>>> [ 22.941586] CPU: 1 UID: 0 PID: 103 Comm: modprobe Not tainted 6.17.0-rc2-00053-gb19a97d57c15-dirty #81 NONE
>>> [ 22.941608] Hardware name: Xilinx Zynq Platform
>>> [ 22.941615] Call trace:
>>> [ 22.941626] unwind_backtrace from show_stack+0x10/0x14
>>> [ 22.941660] show_stack from dump_stack_lvl+0x54/0x68
>>> [ 22.941680] dump_stack_lvl from of_node_release+0x140/0x16c
>>> [ 22.941707] of_node_release from kobject_put+0x110/0x130
>>> [ 22.941745] kobject_put from platform_device_release+0x10/0x3c
>>> [ 22.941782] platform_device_release from device_release+0x30/0xa0
>>> [ 22.941814] device_release from kobject_put+0x88/0x130
>>> [ 22.941845] kobject_put from klist_prev+0xd4/0x16c
>>> [ 22.941879] klist_prev from device_for_each_child_reverse+0x88/0xc8
>>> [ 22.941911] device_for_each_child_reverse from devm_mfd_dev_release+0x30/0x54
>>> [ 22.941941] devm_mfd_dev_release from devres_release_all+0xb0/0x114
>>> [ 22.941974] devres_release_all from device_unbind_cleanup+0xc/0x58
>>> [ 22.942003] device_unbind_cleanup from device_release_driver_internal+0x190/0x1c4
>>> [ 22.942025] device_release_driver_internal from driver_detach+0x54/0xa0
>>> [ 22.942046] driver_detach from bus_remove_driver+0x58/0xa4
>>> [ 22.942066] bus_remove_driver from sys_delete_module+0x178/0x25c
>>> [ 22.942094] sys_delete_module from ret_fast_syscall+0x0/0x54
>>> [ 22.942116] Exception stack(0xf0a1dfa8 to 0xf0a1dff0)
>>> [ 22.942130] dfa0: 004ec438 005a0870 005a0d40 00000080 00000000 005a0d18
>>> [ 22.942144] dfc0: 004ec438 005a0870 005a0190 00000081 005a0d60 005a0870 00000001 0059f6bc
>>> [ 22.942155] dfe0: beccdb20 beccdb10 004ed1f4 b6e9eb40
>>> [ 22.942163] OF: ERROR: next of_node_put() on this node will result in a kobject warning 'refcount_t: underflow; use-after-free.'
>>> [ 23.098617] OF: ERROR: of_node_release() detected bad of_node_put() on /amba_pl/atmel_sama5@...00000/pwm
>>> [ 23.108137] CPU: 1 UID: 0 PID: 103 Comm: modprobe Not tainted 6.17.0-rc2-00053-gb19a97d57c15-dirty #81 NONE
>>> [ 23.108159] Hardware name: Xilinx Zynq Platform
>>> [ 23.108166] Call trace:
>>> [ 23.108173] unwind_backtrace from show_stack+0x10/0x14
>>> [ 23.108206] show_stack from dump_stack_lvl+0x54/0x68
>>> [ 23.108227] dump_stack_lvl from of_node_release+0x140/0x16c
>>> [ 23.108252] of_node_release from kobject_put+0x110/0x130
>>> [ 23.108288] kobject_put from platform_device_release+0x10/0x3c
>>> [ 23.108324] platform_device_release from device_release+0x30/0xa0
>>> [ 23.108354] device_release from kobject_put+0x88/0x130
>>> [ 23.108384] kobject_put from klist_prev+0xd4/0x16c
>>> [ 23.108418] klist_prev from device_for_each_child_reverse+0x88/0xc8
>>> [ 23.108450] device_for_each_child_reverse from devm_mfd_dev_release+0x30/0x54
>>> [ 23.108479] devm_mfd_dev_release from devres_release_all+0xb0/0x114
>>> [ 23.108513] devres_release_all from device_unbind_cleanup+0xc/0x58
>>> [ 23.108541] device_unbind_cleanup from device_release_driver_internal+0x190/0x1c4
>>> [ 23.108563] device_release_driver_internal from driver_detach+0x54/0xa0
>>> [ 23.108585] driver_detach from bus_remove_driver+0x58/0xa4
>>> [ 23.108605] bus_remove_driver from sys_delete_module+0x178/0x25c
>>> [ 23.108631] sys_delete_module from ret_fast_syscall+0x0/0x54
>>> [ 23.108653] Exception stack(0xf0a1dfa8 to 0xf0a1dff0)
>>> [ 23.108667] dfa0: 004ec438 005a0870 005a0d40 00000080 00000000 005a0d18
>>> [ 23.108681] dfc0: 004ec438 005a0870 005a0190 00000081 005a0d60 005a0870 00000001 0059f6bc
>>> [ 23.108691] dfe0: beccdb20 beccdb10 004ed1f4 b6e9eb40
>>> [ 23.108698] OF: ERROR: next of_node_put() on this node will result in a kobject warning 'refcount_t: underflow; use-after-free.'
>> ---
>> drivers/mfd/mfd-core.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/mfd/mfd-core.c b/drivers/mfd/mfd-core.c
>> index 76bd316a50afc5c07ff2a3303c4363b16d0bc023..7d14a1e7631ee8d5e91b228a07b2d05695e41b6e 100644
>> --- a/drivers/mfd/mfd-core.c
>> +++ b/drivers/mfd/mfd-core.c
>> @@ -131,6 +131,7 @@ static int mfd_match_of_node_to_dev(struct platform_device *pdev,
>> of_entry->np = np;
>> list_add_tail(&of_entry->list, &mfd_of_node_list);
>>
>> + of_node_get(np);
>
> Looks okay at first blush.
>
> My question would be, why isn't this required for all calls to device_set_node()?
>
Tough question. I've found drivers that call of_node_get() before
device_set_node() and others that don't. I guess it depends on whether
the release path drops a reference with of_node_put(), but it's not
always trivial to follow all the indirections that occur at that stage.
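
To make the imbalance from the commit message concrete, here is a rough
userspace model of it. This is only a sketch: the model_* names are
hypothetical, the counter is a plain int rather than a kobject, and the
starting count of 1 is an assumption of the model, not something taken
from the OF core. It only shows that a put at release with no matching
get at link time drifts the count down on every add/remove cycle.

```c
/* Userspace model only: these are NOT the kernel's OF helpers. */
struct model_node {
	int refcount;	/* the kernel tracks this through a kobject */
};

struct model_pdev {
	struct model_node *of_node;
};

static struct model_node *model_node_get(struct model_node *np)
{
	if (np)
		np->refcount++;
	return np;
}

static void model_node_put(struct model_node *np)
{
	if (np)
		np->refcount--;
}

/* Link the node to the device; take_ref = 1 models the patched behaviour. */
static void model_link(struct model_pdev *pdev, struct model_node *np,
		       int take_ref)
{
	if (take_ref)
		model_node_get(np);
	pdev->of_node = np;
}

/* Models the put that happens when the platform device is released. */
static void model_pdev_release(struct model_pdev *pdev)
{
	model_node_put(pdev->of_node);
	pdev->of_node = 0;
}

/* Run 'cycles' add/remove cycles and return the node's final count. */
static int model_cycle(int cycles, int take_ref)
{
	struct model_node np = { .refcount = 1 };	/* assumed baseline */
	struct model_pdev pdev = { 0 };
	int i;

	for (i = 0; i < cycles; i++) {
		model_link(&pdev, &np, take_ref);
		model_pdev_release(&pdev);
	}
	return np.refcount;
}
```

With take_ref = 0 the count drops by one per cycle and goes below the
baseline, while with take_ref = 1 every put is paired with a get and the
count stays where it started.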
Best regards,
--
Bastien Curutchet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com