[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fe28c657-451c-064e-4d77-98dda8ea1f1b@arinc9.com>
Date: Sat, 15 Apr 2023 12:43:20 +0300
From: Arınç ÜNAL <arinc.unal@...nc9.com>
To: Daniel Golle <daniel@...rotopia.org>
Cc: Vladimir Oltean <olteanv@...il.com>,
Frank Wunderlich <frank-w@...lic-files.de>,
netdev <netdev@...r.kernel.org>, erkin.bozoglu@...ont.com
Subject: Re: mt7530: dsa_switch_parse_of() fails, causes probe code to run
twice
On 15.04.2023 10:53, Arınç ÜNAL wrote:
> On 15.04.2023 03:52, Daniel Golle wrote:
>> On Sat, Apr 15, 2023 at 03:28:55AM +0300, Arınç ÜNAL wrote:
>>> On 15.04.2023 02:53, Daniel Golle wrote:
>>>> On Sat, Apr 15, 2023 at 02:23:16AM +0300, Arınç ÜNAL wrote:
>>>>> On 15.04.2023 01:48, Daniel Golle wrote:
>>>>>> On Sat, Apr 15, 2023 at 01:41:07AM +0300, Arınç ÜNAL wrote:
>>>>>>> Hey there,
>>>>>>>
>>>>>>> I've been working on the MT7530 DSA subdriver. While doing some
>>>>>>> tests, I
>>>>>>> realised mt7530_probe() runs twice. I moved enabling the
>>>>>>> regulators from
>>>>>>> mt7530_setup() to mt7530_probe(). Enabling the regulators there
>>>>>>> ends up
>>>>>>> with exception warnings on the first time. It works fine when
>>>>>>> mt7530_probe() is run again.
>>>>>>>
>>>>>>> This should not be an expected behaviour, right? Any ideas how we
>>>>>>> can make
>>>>>>> it work the first time?
>>>>>>
>>>>>> Can you share the patch or work-in-progress tree which will allow me
>>>>>> to reproduce this problem?
>>>>>
>>>>> I tested this on vanilla 6.3-rc6. There's just the diff below that is
>>>>> applied. I encountered it on the standalone MT7530 on my Bananapi
>>>>> BPI-R2. I
>>>>> haven't tried it on MCM MT7530 on MT7621 SoC yet.
>>>>>
>>>>>>
>>>>>> It can of course be that regulator driver has not yet been loaded on
>>>>>> the first run and -EPROBE_DEFER is returned in that case. Knowing the
>>>>>> value of 'err' variable below would hence be valuable information.
>>>>>
>>>>> Regardless of enabling the regulator on either mt7530_probe() or
>>>>> mt7530_setup(), dsa_switch_parse_of() always fails.
>>>>
>>>> So dsa_switch_parse_of() can return -EPROBE_DEFER if the ethernet
>>>> driver responsible for the CPU port has not yet been loaded.
>>>>
>>>> See net/dsa/dsa.c (inside function dsa_port_parse_of):
>>>> [...]
>>>> 1232) master = of_find_net_device_by_node(ethernet);
>>>> 1233) of_node_put(ethernet);
>>>> 1234) if (!master)
>>>> 1235) return -EPROBE_DEFER;
>>>> [...]
>>>>
>>>> Hence it would be important to include the value of 'err' in your
>>>> debugging printf output, as -EPROBE_DEFER can be an expected and
>>>> implicitely intended reality and nothing is wrong then.
>>>
>>> Thanks Daniel. I can't do more tests soon but this is probably what's
>>> going
>>> on as the logs already indicate that the MediaTek ethernet driver was
>>> yet to
>>> load.
>>>
>>> As acknowledged, since running the MT7530 DSA subdriver from scratch is
>>> expected if the ethernet driver is not loaded yet, there's not really a
>>> problem. Though the switch is reset twice in a short amount of time.
>>> I don't
>>> think that's very great.
>>
>> That's true, and we should try to avoid that.
>>
>>>
>>> The driver initialisation seems serialised (at least for the drivers
>>> built
>>> into the kernel) as I tried sleeping for 5 seconds on mt7530_probe()
>>> but no
>>> other driver was loaded in the meantime so I got the same behaviour.
>>>
>>> The regulator code will cause a long and nasty exception the first time.
>>> Though there's nothing wrong as it does what it's supposed to do on the
>>> second run. I'm not sure if that's negligible.
>>>
>>> Could we at least somehow make the MT7530 DSA subdriver wait until the
>>> regulator driver is loaded?
>>
>> I assume the regulator-related stackdump is unrelated, but caused by
>> cpufreq changes, which had now been fixed by commit 0883426fd07e
>> ("cpufreq: mediatek: Raise proc and sram max voltage for MT7622/7623").
>>
>> If you are using v6.3-rc6 this commit is still missing there, but
>> manually picking it from linux-next should fix it.
>
> I did one better and just did the test on the current linux-next, I get
> exceptions that seem to be identical. I also made sure this commit was
> actually there.
>
>>
>> Let me know if I can help with testing on my farm of MediaTek boards.
>> I'm a bit nervous about fixing MT7531BE soon, so deciding if we move
>> PLL activation to mt7530_probe() would be essential as it makes the
>> fix much easier...
>
> Can you test this branch on MT7531AE, MT7531BE and the switch on MT7988
> SoC? I just need to complete the patch logs, the code won't change much.
>
> https://github.com/arinc9/linux/commits/for-netnext
>
> I'm thinking if we can -EPROBE_DEFER right at the start of
> mt7530_probe(), it should prevent the reset code from running twice, and
> enabling the regulator will run without any exceptions.
>
> I think I can just keep enabling the regulator on mt7530_setup() if I
> can't figure that out. On MT7623NI, the switch stops working after 35
> seconds with these logs. As long as the regulator is enabled before
> this, everything keeps working.
>
> [ 35.037200] vusb: disabling
> [ 35.040089] vmc: disabling
> [ 35.042856] vmch: disabling
> [ 35.045709] vgp1: disabling
> [ 35.049010] vcamaf: disabling
I was able to confirm the error code is -517, EPROBE_DEFER.
Arınç
Powered by blists - more mailing lists