Message-ID: <313d5a24b6cffa1a9160e624bb6855aa7f66589e.camel@gmail.com>
Date: Wed, 10 Apr 2024 20:28:57 +0200
From: Alban Browaeys <alban.browaeys@...il.com>
To: Conor Dooley <conor@...nel.org>
Cc: dev@...ker-schwesinger.de, Vinod Koul <vkoul@...nel.org>, Kishon Vijay
 Abraham I <kishon@...nel.org>, Heiko Stuebner <heiko@...ech.de>, Chris
 Ruehl <chris.ruehl@...ys.com.hk>,  Rob Herring <robh@...nel.org>, Krzysztof
 Kozlowski <krzysztof.kozlowski+dt@...aro.org>, Conor Dooley
 <conor+dt@...nel.org>, Christopher Obbard <chris.obbard@...labora.com>,
 Doug Anderson <dianders@...omium.org>, Brian Norris
 <briannorris@...omium.org>, Jensen Huang <jensenhuang@...endlyarm.com>,
 linux-phy@...ts.infradead.org,  linux-arm-kernel@...ts.infradead.org,
 linux-rockchip@...ts.infradead.org,  linux-kernel@...r.kernel.org,
 devicetree@...r.kernel.org
Subject: Re: [PATCH 1/3] phy: rockchip: emmc: Enable pulldown for strobe line

On Thursday, 28 March 2024 at 18:01 +0000, Conor Dooley wrote:
> On Thu, Mar 28, 2024 at 06:00:03PM +0100, Alban Browaeys wrote:
> > On Tuesday, 26 March 2024 at 19:46 +0000, Conor Dooley wrote:
> > > On Tue, Mar 26, 2024 at 07:54:35PM +0100, Folker Schwesinger via B4 Relay wrote:
> > > > From: Folker Schwesinger <dev@...ker-schwesinger.de>
> > > > -	if (of_property_read_bool(dev->of_node, "rockchip,enable-strobe-pulldown"))
> > > > -		rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_ENABLE;
> > > > +	if (of_property_read_bool(dev->of_node, "rockchip,disable-strobe-pulldown"))
> > > > +		rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_DISABLE;
> > > 
> > > Unfortunately you cannot do this.
> > > Previously no property at all meant disabled and a property was
> > > required to enable it. With this change the absence of a property
> > > means that it will be enabled.
> > > An old devicetree that wanted this to be disabled would have no
> > > property and will now end up with it enabled. This is an ABI break
> > > and is clearly not backwards compatible, that's a NAK unless it is
> > > demonstrable that no one actually wants to disable it at all.
> > 
> > 
> > But the patch that introduced the new default to disable the pulldown
> > explicitly introduced a regression for at least 4 boards.
> > It took time to sort out that the default to disable pulldown was the
> > culprit but still.
> > Will we carry this new behavior that breaks the default design for
> > rk3399 because, since the regression was introduced, new board
> > definitions might have expected this new behavior?
> > 
> > Could the best option be to revert to "not set a default enable/disable
> > pulldown" (as before the commit that introduced the regression) and
> > allow one to force the pulldown via the enable/disable pulldown
> > property?
> > I mean the commit that introduced a default value for the pulldown did
> > not seem to be about fixing anything. But it broke a lot. And it was
> > really really hard to find the description of this commit to understand
> > that one had to enable pulldown to restore hs400.
> > 
> > In more than 3 years, only one board maintainer noticed that this
> > property was required to get back HS400, and thanks to a user telling
> > me that this board was working I found from this board that this
> > property was "missing" from most board definitions (while it was not
> > required before).
> > 
> > 
> > I am all for not breaking ABI. But what about not reverting a patch
> > that already broke ABI because this patch introduced a new ABI that we
> > don't want to break?
> > I mean shouldn't a new commit with a new ABI that regressed the kernel
> > be reverted?
> 
> I think I said it after OP replied to me yesterday, but this is a pretty
> shitty situation in that the original default picked for the property
> was actually incorrect. Given it's been like this for four years before
> anyone noticed, and others probably depend on the current behaviour,
> that's hard to justify.
> 

A lot of people, mostly downstream, noticed quickly that HS400 was
broken in the 5.10 branch, but because of another commit (more on this
below: the double regulator init that messed up eMMC) this second
breakage went undetected. Most if not all rk3399 boards in Armbian had
HS400 disabled.


It took 3 years to detect that HS400 was broken on a few boards like
Rock Pi4 in the upstream kernel. Many might still be broken.
I would not count on keeping the current behavior meaning no more
broken boards.

From the previous exchanges, the boards that require the pulldown to be
disabled seem well known.

Though I am fine with adding a property to enable the pulldown to any
board definition file where that is required.

I just do not believe that keeping the status quo means everything works
merely because it has been 3 years.
In fact this commit reached the downstream kernels much later. Many
stayed with the 5.10 branch for years.

On the other hand, disabling the pulldown by default is already in
stable/linux-rolling-lts.



> > Mind you, fixing the initial regression 8b5c2b45b8f0 "phy: rockchip:
> > set pulldown for strobe line in dts" does not necessarily mean changing
> > the default to the opposite value but could also be reverting to not
> > setting a default.
> 
> That's also problematic, as the only way to do this is make setting
> one of the enabled or disabled properties required, which is also an
> ABI break, since you'd then be rejecting probe if one is not present.


I don't understand.
How would reverting to not setting the pulldown (enabled or disabled) by
default force all boards to set either enabled or disabled?
I was talking about making the kernel's setting of the pulldown
optional, whether enabled or disabled, to revert to the previous
behavior.

I mean that before the patch that set a default pulldown value (to
disabled) there was no forced value.
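
To make the idea concrete, here is a minimal user-space sketch in C of
what I mean by "no forced value". It is not the driver code: the enum,
the helper names and the bit position are hypothetical, and the real
driver writes the PHY control register through its own accessors. Only
the two property names come from this thread, and the disable one exists
only in the proposed patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tri-state: none of these names exist in the driver. */
enum strobe_pulldown {
	STROBE_PD_UNSET,	/* no property: keep whatever is already set */
	STROBE_PD_ENABLE,	/* "rockchip,enable-strobe-pulldown" present */
	STROBE_PD_DISABLE,	/* "rockchip,disable-strobe-pulldown" present */
};

/* Hypothetical bit position, chosen for illustration only. */
#define REN_STRB_BIT	(1u << 9)

/*
 * Map the presence of the two boolean properties to a mode.
 * If both are present the enable property wins; that tie-break is an
 * arbitrary choice made for this sketch.
 */
enum strobe_pulldown parse_strobe_pulldown(bool has_enable, bool has_disable)
{
	if (has_enable)
		return STROBE_PD_ENABLE;
	if (has_disable)
		return STROBE_PD_DISABLE;
	return STROBE_PD_UNSET;
}

/* Return the register value to write back for the given mode. */
uint32_t apply_strobe_pulldown(uint32_t reg, enum strobe_pulldown mode)
{
	switch (mode) {
	case STROBE_PD_ENABLE:
		return reg | REN_STRB_BIT;
	case STROBE_PD_DISABLE:
		return reg & ~REN_STRB_BIT;
	default:
		/* Nothing requested: the register value is left untouched. */
		return reg;
	}
}
```

With this shape, a devicetree without either property does not fail to
probe; the kernel simply does not touch the pulldown bit, which is how I
understand the behavior before 8b5c2b45b8f0.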



> > Though I don't know if there are pros to setting a default.
> 
> What you probably have to weigh up is the cons of each side. If what
> you lose is HS400 mode with what's in the kernel right now but
> switching to what's been proposed would entirely break some boards, I
> know which I think the lesser of two evils is.

More boards (even if not the most widespread ones, it seems) are broken
by the current behavior.

I agree that only HS400 is broken by keeping the status quo. But as far
as I understand only HS400 will be broken either way:
either keeping the current disabled pulldown breaks the boards based on
the Rockchip default design, or changing it breaks the boards that are
non-standard or have a broken design.
In both cases this leads to data corruption on boot to eMMC.

The only pro of keeping the current value as the default is that most
boards broken by the new default introduced in 2020 "might" already be
fixed (but that is just a guess).



> It's probably up to the platform maintainer to weigh in at this
> point.

I am not familiar with how the delegation works. You mean that from now
on it is up to the Rockchip maintainer?
I am fine with it either way.

I just wanted to point out that maybe we don't have to set a pulldown
value after all, and that all boards would then be fine, as they were
before explicit setting of the pulldown was introduced.


In fact I am more eager to get this fixed, even if that means adding an
enable-pulldown property to the board definitions, than to change the
current behavior.
I just wanted to sort out whether that was the wrong way to fix this
issue (i.e. whether adding a setting on most boards was wrong).


> Hope that helps?
> Conor.


For more than 2 years, I tried various patches and discussed the HS400
breakage on forums. I had bisected the regulator init issue in the 5.10
branch. Sadly it took so long for this issue to be understood that the
commit forcing the pulldown to disabled reached downstream before the
first issue got fixed.
This only made matters worse, because once the double regulator init
issue was fixed HS400 was still broken, this time because the pulldown
was forced to disabled. But nobody noticed this commit that forced a
default pulldown state (it was older than the regulator commit from 5.13
that was backported to the 5.10 stable branch, but it reached downstream
later because it was not backported to 5.10 from 5.11).
Otherwise we would have emailed immediately.
Bisecting was only able to catch the first breakage (as it was only
fixed after the second breakage was introduced).

Maybe the problem is that others and I did not complain to the upstream
kernel ML because we were using heavily patched downstream kernels (like
most if not all downstream ARM kernels). So sadly, the forums from back
then are filled with complaints but nothing seems to have reached the
Linux ML.



About the regulator double init: stable downstream branches were hit by
a bug in the 5.10 stable branch in May 2021, before they switched to
5.11 where this default pulldown was introduced. Thus they could not
detect that this pulldown broke HS400, because HS400 was already broken
by a double regulator init, backported to 5.10 from 5.13:
"
commit 06653ebc0ad2e0b7d799cd71a5c2933ed2fb7a66
Author: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
Date:   Thu May 20 01:12:23 2021 +0300

	regulator: core: resolve supply for boot-on/always-on regulators

	commit 98e48cd9283dbac0e1445ee780889f10b3d1db6a upstream.

	For the boot-on/always-on regulators the set_machine_constrainst() is
	called before resolving rdev->supply. Thus the code would try to enable
	rdev before enabling supplying regulator. Enforce resolving supply
	regulator before enabling rdev.

	Fixes: aea6cb99703e ("regulator: resolve supply after creating regulator")
	Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
	Link: https://lore.kernel.org/r/20210519221224.2868496-1-dmitry.baryshkov@linaro.org
	Signed-off-by: Mark Brown <broonie@...nel.org>
	Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
"
and which to my knowledge was only fixed in v6.1-rc1 "
commit 8a866d527ac0441c0eb14a991fa11358b476b11d
Author: Christian Kohlschütter <christian@...lschutter.com>
Date:   Thu Aug 18 12:46:47 2022 +0000

    regulator: core: Resolve supply name earlier to prevent double-init
    
    Previously, an unresolved regulator supply reference upon calling
    regulator_register on an always-on or boot-on regulator caused
    set_machine_constraints to be called twice.
    
    This in turn may initialize the regulator twice, leading to voltage
    glitches that are timing-dependent. A simple, unrelated configuration
    change may be enough to hide this problem, only to be surfaced by
    chance.
    
    One such example is the SD-Card voltage regulator in a NanoPI R4S that
    would not initialize reliably unless the registration flow was just
    complex enough to allow the regulator to properly reset between calls.
    
    Fix this by re-arranging regulator_register, trying resolve the
    regulator's supply early enough that set_machine_constraints does not
    need to be called twice.
    
    Signed-off-by: Christian Kohlschütter <christian@...lschutter.com>
    Link: https://lore.kernel.org/r/20220818124646.6005-1-christian@kohlschutter.com
    Signed-off-by: Mark Brown <broonie@...nel.org>

"

So most boards were already broken when the commit to force a pulldown
value was introduced.




Regards
Alban
