[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGETcx_QPh=ppHzBdM2_TYZz3o+O7Ab9-JSY52Yz1--iLnykxA@mail.gmail.com>
Date: Tue, 31 Aug 2021 12:26:30 -0700
From: Saravana Kannan <saravanak@...gle.com>
To: Andrew Lunn <andrew@...n.ch>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Linus Walleij <linus.walleij@...aro.org>,
Vivien Didelot <vivien.didelot@...il.com>,
Florian Fainelli <f.fainelli@...il.com>,
Vladimir Oltean <olteanv@...il.com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Len Brown <lenb@...nel.org>,
Alvin Sipraga <ALSI@...g-olufsen.dk>, kernel-team@...roid.com,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
linux-acpi@...r.kernel.org
Subject: Re: [PATCH v1 1/2] driver core: fw_devlink: Add support for FWNODE_FLAG_BROKEN_PARENT
On Tue, Aug 31, 2021 at 6:16 AM Andrew Lunn <andrew@...n.ch> wrote:
>
> > > I must admit, my main problem at the moment is -rc1 in two weeks
> > > time. It seems like a number of board with Ethernet switches will be
> > > broken, that worked before. phy-handle is not limited to switch
> > > drivers, it is also used for Ethernet drivers. So it could be, a
> > > number of Ethernet drivers are also going to be broken in -rc1?
> >
> > Again, in those cases, based on your FEC example, fw_devlink=on
> > actually improves things.
>
> Debatable. I did some testing. As expected some boards with Ethernet
> switches are now broken.
To clarify myself, I'm saying the patch to parse "ethernet" [8] will
make things better with fw_devlink=on for FEC. Not sure if you tested
that patch or not.
And yes, "phy-handle" will make things worse for switches because it
has two issues that need to be fixed. One is this deferred probe thing
for which I gave a patch in the previous email and the other is what
Marek reported (fix in the works). So can you revert "phy-handle"
support and test with [8] and you should see things be better with
fw_devlink=on.
[8] - https://lore.kernel.org/netdev/CAGETcx9=AyEfjX_-adgRuX=8a0MkLnj8sy2KJGhxpNCinJu4yA@mail.gmail.com/
> Without fw_devlink=on, some boards are not
> optimal, but they actually work. With it, they are broken.
>
> I did a bisect, and they have been broken since:
>
> ea718c699055c8566eb64432388a04974c43b2ea is the first bad commit
> commit ea718c699055c8566eb64432388a04974c43b2ea
> Author: Saravana Kannan <saravanak@...gle.com>
> Date: Tue Mar 2 13:11:32 2021 -0800
>
> Revert "Revert "driver core: Set fw_devlink=on by default""
>
> This reverts commit 3e4c982f1ce75faf5314477b8da296d2d00919df.
>
> Since all reported issues due to fw_devlink=on should be addressed by
> this series, revert the revert. fw_devlink=on Take II.
>
> Signed-off-by: Saravana Kannan <saravanak@...gle.com>
> Link: https://lore.kernel.org/r/20210302211133.2244281-4-saravanak@google.com
> Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
>
> drivers/base/core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> So however it is fixed, it needs to go into stable, not just -rc1.
Not sure what was the tip of the tree with which you bisected. For
example, if phy-handle broke things, bisect could still point at this
patch above depending on whether the first bisect is good/bad. Because
reverting this patch effectively disabled phy-handle parsing too.
To be clear, I'm not saying things aren't broken. I'm just pointing
out some nuances with the bisect that we need to be aware of.
> > Again, it's not a widespread problem as I explained before.
> > fw_devlink=on has been the default for 2 kernel versions now. With no
> > unfixed reported issues.
>
> Given that some Ethernet switches have been broken all that time, i
> wonder what else has been broken? Normally, the kernel which is
> release in December becomes the next LTS. It then gets picked up by
> the distros and more wide spread tested. So it could be, you get a
> flood of reports in January and February about things which are
> broken. This is why i don't think you should be relying on bug
> reports, you should be taking a more proactive stance and trying to
> analyse the DTB blobs.
As I mentioned earlier, just looking at DTB doesn't tell me much
because the driver could still be fine depending on how it's written.
Also, I don't have an easy way to do this. If I find a way, I'll do
it.
Btw, most bug reports that have been raised have been fixed with
generic fixes that address general DT patterns. For example,
fw_devlink has cycle detection built in, has support for devices that
never get probed, etc. Enabling fw_devlink=on and handling bug reports
with generic fixes has worked well so far to get fw_devlink to where
it is today. I've tried to be very quick in responding to fw_devlink
issues -- and that has worked out so far and hopefully it'll continue
to work out.
> I will spend some time trying out your proposed fix. See if they are a
> quick fix for stable.
Thanks for testing it out. And worst case, we'll revert phy-handle support.
-Saravana
Powered by blists - more mailing lists