lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9accd63a-961c-4dab-e167-9e2654917a94@gmail.com>
Date:   Mon, 16 Aug 2021 09:33:03 -0500
From:   Frank Rowand <frowand.list@...il.com>
To:     Vladimir Oltean <olteanv@...il.com>,
        Rob Herring <robh+dt@...nel.org>,
        Sascha Hauer <s.hauer@...gutronix.de>,
        devicetree@...r.kernel.org, linux-kernel@...r.kernel.org,
        netdev@...r.kernel.org
Subject: Re: of_node_put() usage is buggy all over drivers/of/base.c?!

Hi Vladimir,

On 8/13/21 8:01 PM, Vladimir Oltean wrote:
> Hi,
> 
> I was debugging an RCU stall which happened during the probing of a
> driver. Activating lock debugging, I see:

I took a quick look at sja1105_mdiobus_register() in v5.14-rc1 and v5.14-rc6.

Looking at the following stack trace, I did not see any calls to
of_find_compatible_node() in sja1105_mdiobus_register().  I am
guessing that maybe there is an inlined function that calls
of_find_compatible_node().  This would likely be either
sja1105_mdiobus_base_tx_register() or sja1105_mdioux_base_t1_register().

> 
> [  101.710694] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:938
> [  101.719119] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1534, name: sh
> [  101.726763] INFO: lockdep is turned off.
> [  101.730674] irq event stamp: 0
> [  101.733716] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
> [  101.739973] hardirqs last disabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
> [  101.748146] softirqs last  enabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
> [  101.756313] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [  101.762569] CPU: 4 PID: 1534 Comm: sh Not tainted 5.14.0-rc5+ #272
> [  101.774558] Call trace:
> [  101.794734]  __might_sleep+0x50/0x88
> [  101.798297]  __mutex_lock+0x60/0x938
> [  101.801863]  mutex_lock_nested+0x38/0x50
> [  101.805775]  kernfs_remove+0x2c/0x50             <---- this takes mutex_lock(&kernfs_mutex);
> [  101.809341]  sysfs_remove_dir+0x54/0x70

The __kobject_del() occurs only if the refcount on the node
becomes zero.  This should never be true when of_find_compatible_node()
calls of_node_put() unless a "from" node is passed to of_find_compatible_node().

In both sja1105_mdiobus_base_tx_register() and sja1105_mdioux_base_t1_register()
a from node ("mdio") is passed to of_find_compatible_node() without first doing an
of_node_get(mdio).  If you add the of_node_get() calls the problem should be fixed.

-Frank


> [  101.813166]  __kobject_del+0x3c/0x80
> [  101.816733]  kobject_put+0xf8/0x108
> [  101.820211]  of_node_put+0x18/0x28
> [  101.823602]  of_find_compatible_node+0xa8/0xf8    <--- this takes raw_spin_lock_irqsave(&devtree_lock)
> [  101.828036]  sja1105_mdiobus_register+0x264/0x7a8
> 
> The pattern of calling of_node_put from under the atomic devtree_lock
> context is pretty widespread in drivers/of/base.c.
> 
> Just by inspecting the code, this seems to be an issue since commit:
> 
> commit 75b57ecf9d1d1e17d099ab13b8f48e6e038676be
> Author: Grant Likely <grant.likely@...aro.org>
> Date:   Thu Feb 20 18:02:11 2014 +0000
> 
>     of: Make device nodes kobjects so they show up in sysfs
> 
>     Device tree nodes are already treated as objects, and we already want to
>     expose them to userspace which is done using the /proc filesystem today.
>     Right now the kernel has to do a lot of work to keep the /proc view in
>     sync with the in-kernel representation. If device_nodes are switched to
>     be kobjects then the device tree code can be a whole lot simpler. It
>     also turns out that switching to using /sysfs from /proc results in
>     smaller code and data size, and the userspace ABI won't change if
>     /proc/device-tree symlinks to /sys/firmware/devicetree/base.
> 
>     v7: Add missing sysfs_bin_attr_init()
>     v6: Add __of_add_property() early init fixes from Pantelis
>     v5: Rename firmware/ofw to firmware/devicetree
>         Fix updating property values in sysfs
>     v4: Fixed build error on Powerpc
>         Fixed handling of dynamic nodes on powerpc
>     v3: Fixed handling of duplicate attribute and child node names
>     v2: switch to using sysfs bin_attributes which solve the problem of
>         reporting incorrect property size.
> 
>     Signed-off-by: Grant Likely <grant.likely@...retlab.ca>
>     Tested-by: Sascha Hauer <s.hauer@...gutronix.de>
>     Cc: Rob Herring <rob.herring@...xeda.com>
>     Cc: Benjamin Herrenschmidt <benh@...nel.crashing.org>
>     Cc: David S. Miller <davem@...emloft.net>
>     Cc: Nathan Fontenot <nfont@...ux.vnet.ibm.com>
>     Cc: Pantelis Antoniou <panto@...oniou-consulting.com>
> 
> because up until that point, of_node_put() was:
> 
> void of_node_put(struct device_node *node)
> {
> 	if (node)
> 		kref_put(&node->kref, of_node_release);
> }
> 
> and not:
> 
> void of_node_put(struct device_node *node)
> {
> 	if (node)
> 		kobject_put(&node->kobj);
> }
> 
> Either I'm holding it very, very wrong, or this is a very severe
> oversight that just happened somehow to go unnoticed for 7 years.
> 
> Please tell me it's me.
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ