lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6b89a9e1-e92e-ca99-9fbd-1d98f6a7864b@bang-olufsen.dk>
Date:   Wed, 18 Aug 2021 10:18:06 +0000
From:   Alvin Šipraga <ALSI@...g-olufsen.dk>
To:     Saravana Kannan <saravanak@...gle.com>,
        Vladimir Oltean <olteanv@...il.com>
CC:     Vladimir Oltean <vladimir.oltean@....com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Jakub Kicinski <kuba@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Florian Fainelli <f.fainelli@...il.com>,
        Andrew Lunn <andrew@...n.ch>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Frank Rowand <frowand.list@...il.com>,
        Rob Herring <robh+dt@...nel.org>
Subject: Re: [PATCH net] net: dsa: sja1105: fix use-after-free after calling
 of_find_compatible_node, or worse

Hi Saravana,

On 8/18/21 4:46 AM, Saravana Kannan wrote:
> On Tue, Aug 17, 2021 at 3:31 PM Vladimir Oltean <olteanv@...il.com> wrote:
>>
>> Hi Alvin,
>>
>> On Tue, Aug 17, 2021 at 09:25:28PM +0000, Alvin Šipraga wrote:
>>> I have an observation that's slightly out of the scope of your patch,
>>> but I'll post here on the off chance that you find it relevant.
>>> Apologies if it's out of place.
>>>
>>> Do these integrated NXP PHYs use a specific PHY driver, or do they just
>>> use the Generic PHY driver?
>>
>> They refuse to probe at all with the Generic PHY driver. I have been
>> caught off guard a few times now when I had a kernel built with
>> CONFIG_NXP_C45_TJA11XX_PHY=n and their probing returns -22 in that case.
>>
>>> If the former is the case, do you experience that the PHY driver fails
>>> to get probed during mdiobus registration if the kernel uses
>>> fw_devlink=on?
>>
>> I don't test with "fw_devlink=on" in /proc/cmdline, this is the first
>> time I do it. It behaves exactly as you say.
>>
>>>
>>> In my case I am writing a new subdriver for realtek-smi, a DSA driver
>>> which registers an internal MDIO bus analogously to sja1105, which is
>>> why I'm asking. I noticed a deferred probe of the PHY driver because the
>>> supplier (ethernet-switch) is not ready - presumably because all of this
>>> is happening in the probe of the switch driver. See below:
>>>
>>> [   83.653213] device_add:3270: device: 'SMI-0': device_add
>>> [   83.653905] device_pm_add:136: PM: Adding info for No Bus:SMI-0
>>> [   83.654055] device_add:3270: device: 'platform:ethernet-switch--mdio_bus:SMI-0': device_add
>>> [   83.654224] device_link_add:843: mdio_bus SMI-0: Linked as a sync state only consumer to ethernet-switch
>>> [   83.654291] libphy: SMI slave MII: probed
>>> ...
>>> [   83.659809] device_add:3270: device: 'SMI-0:00': device_add
>>> [   83.659883] bus_add_device:447: bus: 'mdio_bus': add device SMI-0:00
>>> [   83.659970] device_pm_add:136: PM: Adding info for mdio_bus:SMI-0:00
>>> [   83.660122] device_add:3270: device: 'platform:ethernet-switch--mdio_bus:SMI-0:00': device_add
>>> [   83.660274] devices_kset_move_last:2701: devices_kset: Moving SMI-0:00 to end of list
>>> [   83.660282] device_pm_move_last:203: PM: Moving mdio_bus:SMI-0:00 to end of list
>>> [   83.660293] device_link_add:859: mdio_bus SMI-0:00: Linked as a consumer to ethernet-switch
>>> [   83.660350] __driver_probe_device:736: bus: 'mdio_bus': __driver_probe_device: matched device SMI-0:00 with driver RTL8365MB-VC Gigabit Ethernet
>>> [   83.660365] device_links_check_suppliers:1001: mdio_bus SMI-0:00: probe deferral - supplier ethernet-switch not ready
>>> [   83.660376] driver_deferred_probe_add:138: mdio_bus SMI-0:00: Added to deferred list
>>
>> So it's a circular dependency? Switch cannot finish probing because it
>> cannot connect to PHY, which cannot probe because switch has not
>> finished probing, which....
> 
> Hi Vladimir/Alvin,
> 
> If there's a cyclic dependency between two devices, then fw_devlink=on
> is smart enough to notice that. Once it notices a cycle, it knows that
> it can't tell which one is the real dependency and which one is the
> false dependency and so stops enforcing ordering between the devices
> in the cycle.
> 
> But fw_devlink doesn't understand all the properties yet. Just most of
> them and I'm always trying to add more. So when it only understands
> the property that's causing the false dependency but not the property
> that causes the real dependency, it can cause issues like this where
> fw_devlink=on enforces the false dependency and the driver/code
> enforces the real dependency. These are generally easy to fix -- you
> just need to teach fw_devlink how to parse more properties.
> 
> This is just a preliminary analysis since I don't have all the info
> yet -- so I could be wrong. With that said, I happened to be working
> on adding fw_devlink support for phy-handle property and I think it
> should fix your issue with fw_devlink=on. Can you give [1] a shot?

I tried [1] but it did not seem to have any effect.

> 
> If it doesn't fix it, can one of you please point me to an upstream
> dts (not dtsi) file for a platform in which you see this issue? And
> ideally also the DT nodes and their drivers that are involved in this
> cycle? With that info, I should be able to root cause this if the
> patch above doesn't already fix it.

I'm working with a non-upstream dts - maybe Vladimir is using an 
upstream one? The pattern among the drivers/dts is common between our 
two cases.

But for the sake of this discussion, my dts is pretty much the same as 
what you will find in arch/arm/boot/dts/gemini-dlink-dir-685.dts. The 
nodes of interest from that dts file are below, and the driver is in 
drivers/net/ds/{realtek-smi-core.c,rtl8366rb.c}. It's expected that the 
Realtek PHY driver in drivers/net/phy/realtek.c will get probed as part 
of the mdiobus registration, but that never happens. See my previous 
reply for a debug log.

/ {
	switch {
		compatible = "realtek,rtl8366rb";
		mdc-gpios = <&gpio0 21 GPIO_ACTIVE_HIGH>;
		mdio-gpios = <&gpio0 22 GPIO_ACTIVE_HIGH>;
		reset-gpios = <&gpio0 14 GPIO_ACTIVE_LOW>;
		realtek,disable-leds;

		switch_intc: interrupt-controller {
			/* GPIO 15 provides the interrupt */
			interrupt-parent = <&gpio0>;
			interrupts = <15 IRQ_TYPE_LEVEL_LOW>;
			interrupt-controller;
			#address-cells = <0>;
			#interrupt-cells = <1>;
		};

		ports {
			/* snip */
		};

		mdio {
			compatible = "realtek,smi-mdio";
			#address-cells = <1>;
			#size-cells = <0>;

			phy0: phy@0 {
				reg = <0>;
				interrupt-parent = <&switch_intc>;
				interrupts = <0>;
			};
			phy1: phy@1 {
				reg = <1>;
				interrupt-parent = <&switch_intc>;
				interrupts = <1>;
			};
			phy2: phy@2 {
				reg = <2>;
				interrupt-parent = <&switch_intc>;
				interrupts = <2>;
			};
			phy3: phy@3 {
				reg = <3>;
				interrupt-parent = <&switch_intc>;
				interrupts = <3>;
			};
			phy4: phy@4 {
				reg = <4>;
				interrupt-parent = <&switch_intc>;
				interrupts = <12>;
			};
		};
	};
};

Thanks for looking into this. Let me know if you need any other info or 
want something tested.

Kind regards,
Alvin

> 
>>
>> So how is it supposed to be solved then? Intuitively the 'mdio_bus SMI-0:00'
>> device should not be added to the deferred list, it should have everything
>> it needs right now (after all, it works without fw_devlink). No?
>>
>> It might be the late hour over here too, but right now I just don't
>> know. Let me add Saravana to the discussion too, he made an impressive
>> analysis recently on a PHY probing issue with mdio-mux,
> 
> Lol, thanks for the kind words.
> 
>> so the PHY
>> library probing dependencies should still be fresh in his mind, maybe he
>> has an idea what's wrong.
> 
> [1] - https://lore.kernel.org/lkml/20210818021717.3268255-1-saravanak@google.com/T/#u>> 
> Thanks,
> Saravana
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ