lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 31 Aug 2021 12:26:30 -0700
From:   Saravana Kannan <saravanak@...gle.com>
To:     Andrew Lunn <andrew@...n.ch>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Linus Walleij <linus.walleij@...aro.org>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        Vladimir Oltean <olteanv@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>, Len Brown <lenb@...nel.org>,
        Alvin Sipraga <ALSI@...g-olufsen.dk>, kernel-team@...roid.com,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        linux-acpi@...r.kernel.org
Subject: Re: [PATCH v1 1/2] driver core: fw_devlink: Add support for FWNODE_FLAG_BROKEN_PARENT

On Tue, Aug 31, 2021 at 6:16 AM Andrew Lunn <andrew@...n.ch> wrote:
>
> > > I must admit, my main problem at the moment is -rc1 in two weeks
> > > time. It seems like a number of board with Ethernet switches will be
> > > broken, that worked before. phy-handle is not limited to switch
> > > drivers, it is also used for Ethernet drivers. So it could be, a
> > > number of Ethernet drivers are also going to be broken in -rc1?
> >
> > Again, in those cases, based on your FEC example, fw_devlink=on
> > actually improves things.
>
> Debatable. I did some testing. As expected some boards with Ethernet
> switches are now broken.

To clarify myself, I'm saying the patch to parse "ethernet" [8] will
make things better with fw_devlink=on for FEC. Not sure if you tested
that patch or not.

And yes, "phy-handle" will make things worse for switches because it
has two issues that need to be fixed. One is this deferred probe thing
for which I gave a patch in the previous email and the other is what
Marek reported (fix in the works). So can you revert "phy-handle"
support and test with [8] and you should see things be better with
fw_devlink=on.

[8] - https://lore.kernel.org/netdev/CAGETcx9=AyEfjX_-adgRuX=8a0MkLnj8sy2KJGhxpNCinJu4yA@mail.gmail.com/

> Without fw_devlink=on, some boards are not
> optimal, but they actually work. With it, they are broken.
>
> I did a bisect, and they have been broken since:
>
> ea718c699055c8566eb64432388a04974c43b2ea is the first bad commit
> commit ea718c699055c8566eb64432388a04974c43b2ea
> Author: Saravana Kannan <saravanak@...gle.com>
> Date:   Tue Mar 2 13:11:32 2021 -0800
>
>     Revert "Revert "driver core: Set fw_devlink=on by default""
>
>     This reverts commit 3e4c982f1ce75faf5314477b8da296d2d00919df.
>
>     Since all reported issues due to fw_devlink=on should be addressed by
>     this series, revert the revert. fw_devlink=on Take II.
>
>     Signed-off-by: Saravana Kannan <saravanak@...gle.com>
>     Link: https://lore.kernel.org/r/20210302211133.2244281-4-saravanak@google.com
>     Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
>
>  drivers/base/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> So however it is fixed, it needs to go into stable, not just -rc1.

Not sure what was the tip of the tree with which you bisected. For
example, if phy-handle broke things, bisect could still point at this
patch above depending on whether the first bisect is good/bad. Because
reverting this patch effectively disabled phy-handle parsing too.

To be clear, I'm not saying things aren't broken. I'm just pointing
out some nuances with the bisect that we need to be aware of.

> > Again, it's not a widespread problem as I explained before.
> > fw_devlink=on has been the default for 2 kernel versions now. With no
> > unfixed reported issues.
>
> Given that some Ethernet switches have been broken all that time, i
> wonder what else has been broken? Normally, the kernel which is
> release in December becomes the next LTS. It then gets picked up by
> the distros and more wide spread tested. So it could be, you get a
> flood of reports in January and February about things which are
> broken. This is why i don't think you should be relying on bug
> reports, you should be taking a more proactive stance and trying to
> analyse the DTB blobs.

As I mentioned earlier, just looking at DTB doesn't tell me much
because the driver could still be fine depending on how it's written.
Also, I don't have an easy way to do this. If I find a way, I'll do
it.

Btw, most bug reports that have been raised have been fixed with
generic fixes that address general DT patterns. For example,
fw_devlink has cycle detection built in, has support for devices that
never get probed, etc. Enabling fw_devlink=on and handling bug reports
with generic fixes has worked well so far to get fw_devlink to where
it is today. I've tried to be very quick in responding to fw_devlink
issues -- and that has worked out so far and hopefully it'll continue
to work out.

> I will spend some time trying out your proposed fix. See if they are a
> quick fix for stable.

Thanks for testing it out. And worst case, we'll revert phy-handle support.


-Saravana

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ