[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <875f7448-8402-0c93-2a90-e1d83bb7586a@bang-olufsen.dk>
Date: Thu, 19 Aug 2021 13:42:19 +0000
From: Alvin Šipraga <ALSI@...g-olufsen.dk>
To: Saravana Kannan <saravanak@...gle.com>
CC: Vladimir Oltean <olteanv@...il.com>,
Vladimir Oltean <vladimir.oltean@....com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Jakub Kicinski <kuba@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Florian Fainelli <f.fainelli@...il.com>,
Andrew Lunn <andrew@...n.ch>,
Vivien Didelot <vivien.didelot@...il.com>,
Frank Rowand <frowand.list@...il.com>,
Rob Herring <robh+dt@...nel.org>
Subject: Re: [PATCH net] net: dsa: sja1105: fix use-after-free after calling
of_find_compatible_node, or worse
Hi Saravana,
Thanks for your lucid analysis, it's very much appreciated. Please find
my replies inline - but in summary, I don't think there is anything more
I can ask of you right now.
On 8/19/21 5:28 AM, Saravana Kannan wrote:
> On Wed, Aug 18, 2021 at 3:18 AM Alvin Šipraga <ALSI@...g-olufsen.dk> wrote:
>>
>> Hi Saravana,
>>
>> On 8/18/21 4:46 AM, Saravana Kannan wrote:
>>> On Tue, Aug 17, 2021 at 3:31 PM Vladimir Oltean <olteanv@...il.com> wrote:
>>>>
>>>> Hi Alvin,
>>>>
>>>> On Tue, Aug 17, 2021 at 09:25:28PM +0000, Alvin Šipraga wrote:
>>>>> I have an observation that's slightly out of the scope of your patch,
>>>>> but I'll post here on the off chance that you find it relevant.
>>>>> Apologies if it's out of place.
>>>>>
>>>>> Do these integrated NXP PHYs use a specific PHY driver, or do they just
>>>>> use the Generic PHY driver?
>>>>
>>>> They refuse to probe at all with the Generic PHY driver. I have been
>>>> caught off guard a few times now when I had a kernel built with
>>>> CONFIG_NXP_C45_TJA11XX_PHY=n and their probing returns -22 in that case.
>>>>
>>>>> If the former is the case, do you experience that the PHY driver fails
>>>>> to get probed during mdiobus registration if the kernel uses
>>>>> fw_devlink=on?
>>>>
>>>> I don't test with "fw_devlink=on" in /proc/cmdline, this is the first
>>>> time I do it. It behaves exactly as you say.
>>>>
>>>>>
>>>>> In my case I am writing a new subdriver for realtek-smi, a DSA driver
>>>>> which registers an internal MDIO bus analogously to sja1105, which is
>>>>> why I'm asking. I noticed a deferred probe of the PHY driver because the
>>>>> supplier (ethernet-switch) is not ready - presumably because all of this
>>>>> is happening in the probe of the switch driver. See below:
>>>>>
>>>>> [ 83.653213] device_add:3270: device: 'SMI-0': device_add
>>>>> [ 83.653905] device_pm_add:136: PM: Adding info for No Bus:SMI-0
>>>>> [ 83.654055] device_add:3270: device: 'platform:ethernet-switch--mdio_bus:SMI-0': device_add
>>>>> [ 83.654224] device_link_add:843: mdio_bus SMI-0: Linked as a sync state only consumer to ethernet-switch
>>>>> [ 83.654291] libphy: SMI slave MII: probed
>>>>> ...
>>>>> [ 83.659809] device_add:3270: device: 'SMI-0:00': device_add
>>>>> [ 83.659883] bus_add_device:447: bus: 'mdio_bus': add device SMI-0:00
>>>>> [ 83.659970] device_pm_add:136: PM: Adding info for mdio_bus:SMI-0:00
>>>>> [ 83.660122] device_add:3270: device: 'platform:ethernet-switch--mdio_bus:SMI-0:00': device_add
>>>>> [ 83.660274] devices_kset_move_last:2701: devices_kset: Moving SMI-0:00 to end of list
>>>>> [ 83.660282] device_pm_move_last:203: PM: Moving mdio_bus:SMI-0:00 to end of list
>>>>> [ 83.660293] device_link_add:859: mdio_bus SMI-0:00: Linked as a consumer to ethernet-switch
>>>>> [ 83.660350] __driver_probe_device:736: bus: 'mdio_bus': __driver_probe_device: matched device SMI-0:00 with driver RTL8365MB-VC Gigabit Ethernet
>>>>> [ 83.660365] device_links_check_suppliers:1001: mdio_bus SMI-0:00: probe deferral - supplier ethernet-switch not ready
>>>>> [ 83.660376] driver_deferred_probe_add:138: mdio_bus SMI-0:00: Added to deferred list
>>>>
>>>> So it's a circular dependency? Switch cannot finish probing because it
>>>> cannot connect to PHY, which cannot probe because switch has not
>>>> finished probing, which....
>>>
>>> Hi Vladimir/Alvin,
>>>
>>> If there's a cyclic dependency between two devices, then fw_devlink=on
>>> is smart enough to notice that. Once it notices a cycle, it knows that
>>> it can't tell which one is the real dependency and which one is the
>>> false dependency and so stops enforcing ordering between the devices
>>> in the cycle.
>>>
>>> But fw_devlink doesn't understand all the properties yet. Just most of
>>> them and I'm always trying to add more. So when it only understands
>>> the property that's causing the false dependency but not the property
>>> that causes the real dependency, it can cause issues like this where
>>> fw_devlink=on enforces the false dependency and the driver/code
>>> enforces the real dependency. These are generally easy to fix -- you
>>> just need to teach fw_devlink how to parse more properties.
>>>
>>> This is just a preliminary analysis since I don't have all the info
>>> yet -- so I could be wrong. With that said, I happened to be working
>>> on adding fw_devlink support for phy-handle property and I think it
>>> should fix your issue with fw_devlink=on. Can you give [1] a shot?
>>
>> I tried [1] but it did not seem to have any effect.
>>
>>>
>>> If it doesn't fix it, can one of you please point me to an upstream
>>> dts (not dtsi) file for a platform in which you see this issue? And
>>> ideally also the DT nodes and their drivers that are involved in this
>>> cycle? With that info, I should be able to root cause this if the
>>> patch above doesn't already fix it.
>>
>> I'm working with a non-upstream dts - maybe Vladimir is using an
>> upstream one? The pattern among the drivers/dts is common between our
>> two cases.
>
> Ideally, I can get a fully upstream example where this issue is
> happening so that I can look at the actual code that's hitting this
> issue and be sure my analysis is right.
Due to NDA issues at work, I am unable to publish the DTS right now. But
based on your analysis below, I believe that the problem I am
experiencing is exactly for the reasons you describe.
>
>>
>> But for the sake of this discussion, my dts is pretty much the same as
>> what you will find in arch/arm/boot/dts/gemini-dlink-dir-685.dts. The
>> nodes of interest from that dts file are below, and the driver is in
>> drivers/net/ds/{realtek-smi-core.c,rtl8366rb.c}. It's expected that the
>> Realtek PHY driver in drivers/net/phy/realtek.c will get probed as part
>> of the mdiobus registration, but that never happens. See my previous
>> reply for a debug log.
>
> Your DTS might be similar to this, but the driver code also matters
> for me to be sure. Anyway, I took a look at this, but my analysis
> below is going to be sketchy because I'm not looking at the actual
> code that's reproducing this issue.
The driver code I am working with is pretty much the same in the probe
path (the driver is modelled on rtl8366rb.c and hooks into
realtek-smi-core.c), so it was not a waste of your time to analyze the
upstream case. Thanks a lot.
>
> Assuming this issue actually happens with the example you pointed to
> (I don't know this yet), here's what is happening:
>
> The main problem is that the parent device switch seems to be assuming
> it's child/grandchild devices (mdiobus/PHYs) will have probed
> successfully as soon as they are added. This assumption is not true
> and can be broken for multiple reasons such as:
>
> 1. The driver for the child devices (PHYs in this case) could be
> loaded as a module after the parent (switch) is probed. So when the
> devices are added, the PHYs would not be probed.
> 2. The child devices could defer probe because one of their suppliers
> isn't ready yet. Either because of fw_devlink=on or the framework
> itself returning -EPROBE_DEFER.
> 3. The child devices could be getting probed asynchronously. So the
> device_add() would kick off a thread to probe the child devices in a
> separate thread.
>
> (2) is what is happening in this case. fw_devlink=on sees that
> "switch" implements the "switch_intc" and "switch" hasn't finished
> probing yet. So it has no way of knowing that switch_intc is actually
> ready. And even if switch_intc was registered as part of switch's
> probe() by the time the PHYs are added, switch_intc could get
> deregistered if the probe fails at a later point. So until probe()
> returns 0, fw_devlink can't be fully sure the supplier (switch_intc)
> is ready. Which is good in general because you won't have to
> forcefully unbind (if that is even handled correctly in the first
> place) the consumers of a device if it fails probe() half way through
> registering a few services.
Right, that makes perfect sense.
>
> I don't fully understand the networking frameworks, but I think
> Vladimir might have a point in his earlier reply [1]. If you can make
> the switch driver not assume its child PHYs are ready during the
> switch's probe and instead have the switch check if the PHYs are ready
> when the switch is "opened" that'd be better.
I am pretty new to the DSA subsystem, but it appears to me that DSA is
making this assumption: I don't think DSA switches have a concept of
"open", as everything is handled in the dsa_register_switch() call in a
DSA driver's probe function. And as Vladimir pointed out, this is also
where phylink_of_phy_connect() is being called. A quick look at the
other DSA drivers suggests that mv88e6xxx may also suffer from case (2)
above, so it seems like a more general issue.
>
> We can come up with hacks that'll delete the dependency that
> fw_devlink=on is trying to enforce, but IMHO the proper fix is to have
> parent drivers not assume child devices will be probed as soon as
> device_add(child) returns. That's not guaranteed at all.
You enumerated a number of reasons why, so yes, I agree that it is not
necessarily a safe assumption to make.
>
> Btw, I do know why things work when you do the module load/unload
> thing you mention in [2]. That has to do with some forced deletion of
> dependencies that happens when device_bind_driver() is called when the
> Generic PHY driver is used. The reason for why that's done is kind of
> unrelated to the issue at hand, but the comment for
> device_links_force_bind() should tell you why.
Thanks for clearing that up.
Kind regards,
Alvin
Powered by blists - more mailing lists