Open Source and information security mailing list archives
 
Message-ID: <2192003.Icojqenx9y@diego>
Date: Fri, 19 Apr 2024 16:31:05 +0200
From: Heiko Stübner <heiko@...ech.de>
To: Alban Browaeys <alban.browaeys@...il.com>, Conor Dooley <conor@...nel.org>
Cc: dev@...ker-schwesinger.de, Vinod Koul <vkoul@...nel.org>,
 Kishon Vijay Abraham I <kishon@...nel.org>,
 Chris Ruehl <chris.ruehl@...ys.com.hk>, Rob Herring <robh@...nel.org>,
 Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
 Conor Dooley <conor+dt@...nel.org>,
 Christopher Obbard <chris.obbard@...labora.com>,
 Doug Anderson <dianders@...omium.org>,
 Brian Norris <briannorris@...omium.org>,
 Jensen Huang <jensenhuang@...endlyarm.com>, linux-phy@...ts.infradead.org,
 linux-arm-kernel@...ts.infradead.org, linux-rockchip@...ts.infradead.org,
 linux-kernel@...r.kernel.org, devicetree@...r.kernel.org
Subject: Re: [PATCH 1/3] phy: rockchip: emmc: Enable pulldown for strobe line

On Thursday, 11 April 2024, 17:42:24 CEST, Conor Dooley wrote:
> On Wed, Apr 10, 2024 at 08:28:57PM +0200, Alban Browaeys wrote:
> > On Thursday, 28 March 2024 at 18:01 +0000, Conor Dooley wrote:
> > > On Thu, Mar 28, 2024 at 06:00:03PM +0100, Alban Browaeys wrote:
> > > > On Tuesday, 26 March 2024 at 19:46 +0000, Conor Dooley wrote:
> > > > > On Tue, Mar 26, 2024 at 07:54:35PM +0100, Folker Schwesinger via B4 Relay wrote:
> > > > > > From: Folker Schwesinger <dev@...ker-schwesinger.de>
> > > > > > -	if (of_property_read_bool(dev->of_node, "rockchip,enable-strobe-pulldown"))
> > > > > > -		rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_ENABLE;
> > > > > > +	if (of_property_read_bool(dev->of_node, "rockchip,disable-strobe-pulldown"))
> > > > > > +		rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_DISABLE;
> > > > > 
> > > > > Unfortunately you cannot do this.
> > > > > Previously no property at all meant disabled and a property was
> > > > > required to enable it. With this change the absence of a property
> > > > > means that it will be enabled.
> > > > > An old devicetree that wanted this to be disabled would have no
> > > > > property and will now end up with it enabled. This is an ABI break
> > > > > and is clearly not backwards compatible, so that's a NAK unless it
> > > > > is demonstrable that no one actually wants to disable it at all.
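
A minimal sketch of the compatibility problem described above, written as plain user-space C rather than driver code. `struct dt_props` and the three helpers are hypothetical stand-ins, not kernel APIs; they only model which boolean property is present in a board's devicetree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a board's devicetree node: it records only
 * whether each boolean property is present. This is not the kernel's
 * struct device_node. */
struct dt_props {
	bool has_enable_strobe_pulldown;
	bool has_disable_strobe_pulldown;
};

/* Current binding: the pulldown is enabled only when the "enable"
 * property is present; no property means disabled. */
static bool pulldown_current(const struct dt_props *dt)
{
	assert(dt != NULL);
	return dt->has_enable_strobe_pulldown;
}

/* Proposed binding: the pulldown is enabled unless the "disable"
 * property is present; no property means enabled. */
static bool pulldown_proposed(const struct dt_props *dt)
{
	assert(dt != NULL);
	return !dt->has_disable_strobe_pulldown;
}

/* True when an unchanged devicetree would get different hardware
 * behaviour after the change, i.e. the change breaks the ABI for it. */
static bool abi_break(const struct dt_props *dt)
{
	return pulldown_current(dt) != pulldown_proposed(dt);
}
```

For a devicetree with neither property, `pulldown_current()` returns false while `pulldown_proposed()` returns true, so `abi_break()` reports that the unchanged devicetree would see different behaviour.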
> > > > 
> > > > 
> > > > But the patch that introduced the new default to disable the
> > > > pulldown explicitly introduced a regression for at least 4 boards.
> > > > It took time to work out that the default of disabling the pulldown
> > > > was the culprit, but still.
> > > > Will we carry this new behavior, which breaks the default design for
> > > > rk3399, because new board definitions added since the regression was
> > > > introduced might have expected it?
> > > > 
> > > > Could the best option be to revert to "not set a default
> > > > enable/disable pulldown" (as before the commit that introduced the
> > > > regression) and allow one to force the pulldown via the
> > > > enable/disable pulldown property?
> > > > I mean the commit that introduced a default value for the pulldown
> > > > did not seem to be about fixing anything. But it broke a lot. And it
> > > > was really, really hard to find the description of this commit to
> > > > understand that one had to enable the pulldown to restore HS400.
> > > > 
> > > > In more than 3 years, only one board maintainer noticed that this
> > > > property was required to get HS400 back, and thanks to a user
> > > > telling me that this board was working, I found from this board that
> > > > this property was "missing" from most board definitions (while it was
> > > > not required before).
> > > > 
> > > > 
> > > > I am all for not breaking ABI. But what about not reverting a patch
> > > > that already broke ABI because this patch introduced a new ABI that
> > > > we don't want to break?
> > > > I mean shouldn't a new commit with a new ABI that regressed the
> > > > kernel be reverted?
> > > 
> > > I think I said it after OP replied to me yesterday, but this is a
> > > pretty shitty situation in that the original default picked for the
> > > property was actually incorrect. Given it's been like this for four
> > > years before anyone noticed, and others probably depend on the current
> > > behaviour, that's hard to justify.
> > > 
> > 
> > A lot of people noticed fast that HS400 was broken in the 5.10 branch,
> > but due to another commit (more on that later, i.e. the double regulator
> > init that messed up eMMC) this second breakage was not detected. But
> > mostly downstream. And most if not all rk3399 boards in Armbian had
> > HS400 disabled.
> > 
> > 
> > It took 3 years to detect that HS400 was broken on a few boards like
> > Rock Pi4 in the upstream kernel. Many might still be broken.
> > I would not count on the fact that keeping the current behavior equals
> > no more broken boards.
> > 
> > From the previous exchanges, the boards that require the pulldown to be
> > disabled seem well known.
> > 
> > Though I am fine with adding a property to enable the pulldown to any
> > board definition file where that is required.
> > 
> > Only I do not believe that keeping the status quo equals everything
> > works just because it has been 3 years.
> 
> FWIW, I didn't say this. Clearly if that was the case, this patch would
> never have arrived.
> 
> > In fact this commit reached the downstream kernels way later. Many
> > stayed with the 5.10 branch for years.
> > 
> > But on the other side, disabling the pulldown by default is already in
> > stable/linux-rolling-lts.
> > 
> > > > Mind you, fixing the initial regression 8b5c2b45b8f0 "phy: rockchip:
> > > > set pulldown for strobe line in dts" does not necessarily mean
> > > > changing the default to the opposite value but could also mean
> > > > reverting to not setting a default.
> > > 
> > > That's also problematic, as the only way to do this is make setting
> > > one of the enabled or disabled properties required, which is also an
> > > ABI break, since you'd then be rejecting probe if one is not present.
> > 
> > 
> > I don't understand.
> > How does reverting to not setting the pulldown to either enabled or
> > disabled by default force all boards to set either enabled or disabled?
> > I was talking about making the pulldown setting by the kernel optional,
> > be it enabled or disabled, to revert to the previous behavior.
> > 
> > I mean before the patch to set a default pulldown value (to disabled)
> > there was no forced value.
> 
> Ah, maybe I misunderstood what the code originally did. Did the original
> code leave the bit however the bootloader or reset value had left it?
> In that case, probe wouldn't be rejected and you'd not have the sort of
> issue that I mentioned above.
> 
> > > > Though I don't know if there are pros to setting a default.
> > > 
> > > What you probably have to weigh up is the cons of each side. If what
> > > you lose is HS400 mode with what's in the kernel right now but
> > > switching to what's been proposed would entirely break some boards, I
> > > know which I think the lesser of two evils is.
> > 
> > More boards (even if not the most widespread ones, it seems) are broken
> > by the current behavior.
> > 
> > I agree that only HS400 is broken by keeping the status quo. But as far
> > as I understand only HS400 will be broken either way,
> > be that by keeping the current disabled pulldown, which breaks the
> > boards based on the Rockchip default design, or by changing it, which
> > breaks the boards that are non-standard or have a broken design.
> > In both cases this leads to data corruption on boot to eMMC.
> > 
> > The only pro of keeping the current value as the default is that most
> > boards broken by the new default introduced in 2020 "might" already be
> > fixed (but that is just a guess).

which I guess are the least stale boards too.

> > > It's probably up to the platform maintainer to weigh in at this
> > > point.
> > 
> > I am not familiar with the delegation scope. You mean that from now
> > on it is up to the rockchip maintainer?
> > I am fine with it either way.
> 
> Yes, I meant the rockchip maintainer. I'm only a lowly bindings
> maintainer, without any knowledge of rockchip specifics or the type of
> boards we're talking about being broken here. Someone has to make a
> judgement call about which "no property" behaviour is used going forward
> and I don't want that to be me!

I'm somehow all for not changing defaults again.

I think in the past there was a similar example in some other kernel part,
where some change broke the ABI, but meanwhile another ABI depended
on the changed behaviour, so a revert was not possible.

I think it's somewhat similar here. If the change has been in the kernel
for 3-4 years now, I do think that ship has sailed somehow.

As was said above, boards introduced since 2020 might already be fixed
and essentially for boards that weren't, it does look like these didn't run
a mainline kernel for like 4 years now.

So if it comes down to deciding who to keep working, I'm more in favor of
those that did run on mainline in the years since.


Though not sure if I understood all the details here yet.


Heiko

> 
> > I just wanted to point out that maybe we don't have to set a pulldown
> > value after all. And that then all boards will be fine as before
> > setting the pulldown explicitly was introduced.
> 
> By "all boards will be fine" you mean "all boards that expected the
> kernel didn't touch this bit will be fine". The boards that need the
> kernel to set this bit because it {comes out of reset,is set by firmware}
> incorrectly are going to need a property added if we revert the default
> behaviour to not touching the bit.
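
The "do not touch the bit" option discussed here can be sketched as a three-way policy, assuming both an enable and a disable property were to exist. Everything below is hypothetical and user-space only: `struct strobe_props`, the enum, and the bit position `REN_STRB_BIT` are placeholders, not the driver's real definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit position for the strobe pulldown enable; the real
 * register layout is not taken from the driver. */
#define REN_STRB_BIT (1u << 0)

enum strobe_pulldown {
	STROBE_KEEP,    /* no property: leave the reset/firmware value alone */
	STROBE_ENABLE,  /* "enable" property present */
	STROBE_DISABLE, /* "disable" property present */
};

/* Hypothetical model of which boolean properties a board sets. */
struct strobe_props {
	bool has_enable;
	bool has_disable;
};

/* Map the properties to a policy. If both are present, "enable" wins
 * here; that tie-break is an arbitrary choice for this sketch. */
static enum strobe_pulldown strobe_policy(const struct strobe_props *p)
{
	assert(p != NULL);
	if (p->has_enable)
		return STROBE_ENABLE;
	if (p->has_disable)
		return STROBE_DISABLE;
	return STROBE_KEEP;
}

/* Apply the policy to a register value and return the new value. */
static unsigned int apply_strobe_policy(unsigned int reg,
					enum strobe_pulldown policy)
{
	switch (policy) {
	case STROBE_ENABLE:
		return reg | REN_STRB_BIT;
	case STROBE_DISABLE:
		return reg & ~REN_STRB_BIT;
	case STROBE_KEEP:
		break;
	}
	return reg;
}
```

With this shape, a board with no property keeps whatever the bootloader or reset left in the register, and only boards where that value is wrong need a property added.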
> 
> > In fact I am more eager to get this fixed, be it by adding an
> > enable-pulldown property to the board definitions, than to change the
> > current behavior.
> > I just wanted to sort out whether that was not the wrong way to fix this
> > issue (i.e. whether adding a setting on most boards was wrong).
> 
> > For more than 2 years, I tried various patches and discussed the HS400
> > breakage on forums. I had bisected the regulator init issue in the 5.10
> > branch. Sadly it took so much time for this issue to be understood that
> > the commit forcing the pulldown to disabled was introduced downstream
> > before the first issue got fixed.
> > This only made matters worse, because when one fixed the double
> > regulator init issue HS400 was still broken, this time because the
> > pulldown was forced to disabled. But nobody noticed this commit that
> > forced a default pulldown state (it was older than the regulator
> > commit from 5.13 backported to the 5.10 stable branch, but it
> > reached downstream later due to not having been backported to 5.10 from
> > 5.11).
> > Otherwise we would have emailed immediately.
> > Bisecting was only able to catch the first breakage (as it was only
> > fixed after the second breakage was introduced).
> > 
> > Maybe the problem is that I and others did not complain to the
> > upstream kernel ML because we were using heavily patched downstream
> > kernels (like most if not all downstream ARM kernels). So sadly, the
> > forums from back then are filled with complaints but nothing seems to
> > have reached the Linux ML.
> 
> Aye, and all I can really say there is to buy boards from a vendor that
> doesn't use some horribly hacked downstream kernel, which I know is
> clearly an unsatisfactory suggestion. That said, we probably should have
> caught that the new default behaviour when the changes were made was not
> the default before. There was only one DT maintainer then though, and
> things just slip by :/
> 




