Message-ID: <20241205165251.fbf3ty6jgdqt4r3x@thinkpad>
Date: Thu, 5 Dec 2024 22:22:51 +0530
From: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
To: Stephen Boyd <sboyd@...nel.org>
Cc: Johan Hovold <johan@...nel.org>, Viresh Kumar <viresh.kumar@...aro.org>,
	Johan Hovold <johan+linaro@...nel.org>,
	Michael Turquette <mturquette@...libre.com>,
	linux-clk@...r.kernel.org, linux-kernel@...r.kernel.org,
	regressions@...ts.linux.dev, Aishwarya TCV <aishwarya.tcv@....com>,
	Chuan Liu <chuan.liu@...ogic.com>,
	Sudeep Holla <sudeep.holla@....com>, linux-pm@...r.kernel.org
Subject: Re: [PATCH] Revert "clk: Fix invalid execution of clk_set_rate"

On Tue, Dec 03, 2024 at 11:30:07AM -0800, Stephen Boyd wrote:
> Quoting Manivannan Sadhasivam (2024-12-03 01:21:51)
> > On Tue, Dec 03, 2024 at 09:25:01AM +0100, Johan Hovold wrote:
> > > [ +CC: Viresh and Sudeep ]
> > > 
> > > On Mon, Dec 02, 2024 at 05:20:06PM -0800, Stephen Boyd wrote:
> > > > Quoting Johan Hovold (2024-12-02 02:06:21)
> > > > > This reverts commit 25f1c96a0e841013647d788d4598e364e5c2ebb7.
> > > > > 
> > > > > The offending commit results in errors like
> > > > > 
> > > > >         cpu cpu0: _opp_config_clk_single: failed to set clock rate: -22
> > > > > 
> > > > > spamming the logs on the Lenovo ThinkPad X13s and other Qualcomm
> > > > > machines when cpufreq tries to update the CPUFreq HW Engine clocks.
> > > > > 
> > > > > As mentioned in commit 4370232c727b ("cpufreq: qcom-hw: Add CPU clock
> > > > > provider support"):
> > > > > 
> > > > >         [T]he frequency supplied by the driver is the actual frequency
> > > > >         that comes out of the EPSS/OSM block after the DCVS operation.
> > > > >         This frequency is not same as what the CPUFreq framework has set
> > > > >         but it is the one that gets supplied to the CPUs after
> > > > >         throttling by LMh.
> > > > > 
> > > > > which seems to suggest that the driver relies on the previous behaviour
> > > > > of clk_set_rate().
> > > > 
> > > > I don't understand why a clk provider is needed there. Is anyone looking
> > > > into the real problem?
> > > 
> > > I mentioned this to Mani yesterday, but I'm not sure if he has had time
> > > to look into it yet. And I forgot to CC Viresh who was involved in
> > > implementing this. There is a comment of his in the thread where this
> > > feature was added:
> > > 
> > >       Most likely no one will ever do clk_set_rate() on this new
> > >       clock, which is fine, though OPP core will likely do
> > >       clk_get_rate() here.
> > > 
> > > which may suggest that some underlying assumption has changed. [1]
> > > 
> 
> Yikes.
> 
> > 
> > I just looked into the issue this morning. The commit that triggered the errors
> > seems to be doing the right thing (although the commit message was a bit hard to
> > understand), but the problem is this check which gets triggered now:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/clk/clk.c?h=v6.13-rc1#n2319
> > 
> > Since the qcom-cpufreq* clocks don't have parents now (they should've been
> > defined anyway) and there is no CLK_SET_RATE_PARENT flag set, the check returns
> > NULL for the 'top' clock. Then clk_core_set_rate_nolock() returns -EINVAL,
> > causing the reported error.
> > 
> > But I don't quite understand why clk_core_set_rate_nolock() fails if there is no
> > parent or CLK_SET_RATE_PARENT is not set. The API is supposed to set the rate of
> > the passed clock irrespective of the parent. Propagating the rate change to
> > parent is not strictly needed and doesn't make sense if the parent is a fixed
> > clock like XO.
> 
> The recalc_rate clk_op is telling the framework that the clk is at a
> different rate than is requested by the clk consumer _and_ than what the
> framework thinks the clk is currently running at. The clk_set_rate()
> call is going to attempt to satisfy that request, and because there
> isn't a determine_rate/round_rate clk_op it assumes the clk can't change
> rate so it looks to see if there's a parent that can be changed to
> satisfy the rate. There isn't a parent either, so the clk_set_rate()
> call fails because the rate can't be achieved on this clk.
> 
> It may work to have a determine_rate clk_op that is like the recalc_rate
> one that says "this rate you requested is going to turn into whatever
> the hardware is running at" by simply returning the rate that the clk is
> running at.

Sounds reasonable to me. Fix submitted incorporating your suggestion, thanks!
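
For illustration, here is a minimal user-space sketch of the behaviour
discussed above. It is a simplified model under stated assumptions, not the
kernel code and not the submitted fix: struct model_clk, struct model_clk_ops,
model_clk_set_rate() and the two ops tables are hypothetical names, and the
real clk core does considerably more (locking, notifiers, rate ranges). It
only shows why a clk with recalc_rate but no determine_rate and no usable
parent ends up with -EINVAL, and why a determine_rate that reports the current
hardware rate turns the request into a no-op.

```c
#include <errno.h>

struct model_clk;

/* Hypothetical, cut-down stand-in for the kernel's struct clk_ops. */
struct model_clk_ops {
	unsigned long (*recalc_rate)(struct model_clk *clk);
	/* On success returns 0 and stores the achievable rate in *rate. */
	int (*determine_rate)(struct model_clk *clk, unsigned long *rate);
};

/* Hypothetical, cut-down stand-in for the framework's per-clk state. */
struct model_clk {
	const struct model_clk_ops *ops;
	struct model_clk *parent;
	int set_rate_parent;		/* models CLK_SET_RATE_PARENT */
	unsigned long hw_rate;		/* what the "hardware" runs at */
	unsigned long cached_rate;	/* what the framework believes */
};

/* Report the rate the hardware is actually running at. */
unsigned long model_recalc_rate(struct model_clk *clk)
{
	return clk->hw_rate;
}

/* "Whatever you asked for turns into the current hardware rate." */
int model_determine_rate(struct model_clk *clk, unsigned long *rate)
{
	*rate = clk->hw_rate;
	return 0;
}

const struct model_clk_ops ops_recalc_only = {
	.recalc_rate = model_recalc_rate,
};

const struct model_clk_ops ops_with_determine = {
	.recalc_rate = model_recalc_rate,
	.determine_rate = model_determine_rate,
};

int model_clk_set_rate(struct model_clk *clk, unsigned long req)
{
	unsigned long rate = req;
	int ret;

	/* Refresh the cached rate from the "hardware" first. */
	clk->cached_rate = clk->ops->recalc_rate(clk);

	/* Let the clk say which rate it can actually provide. */
	if (clk->ops->determine_rate) {
		ret = clk->ops->determine_rate(clk, &rate);
		if (ret)
			return ret;
	}

	/* Already at the resulting rate: nothing to do. */
	if (rate == clk->cached_rate)
		return 0;

	/* The clk can adjust itself; the model just records the new rate. */
	if (clk->ops->determine_rate) {
		clk->cached_rate = rate;
		return 0;
	}

	/* The clk cannot adjust itself: try to propagate to the parent. */
	if (clk->parent && clk->set_rate_parent)
		return model_clk_set_rate(clk->parent, req);

	/* No way to reach the requested rate on this clk. */
	return -EINVAL;
}
```

In this model, a parentless clk using ops_recalc_only fails any request that
differs from hw_rate, while the same clk using ops_with_determine accepts the
request and simply stays at hw_rate.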

- Mani

-- 
மணிவண்ணன் சதாசிவம்
