[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALHNRZ_EbtHSXaDQ+1gGf3HjdyW5Q54EDN901-r8A_aXLbDJkw@mail.gmail.com>
Date: Tue, 2 Sep 2025 12:21:48 -0500
From: Aaron Kling <webgeek1234@...il.com>
To: Viresh Kumar <viresh.kumar@...aro.org>
Cc: Krzysztof Kozlowski <krzk@...nel.org>, Rob Herring <robh@...nel.org>, Conor Dooley <conor+dt@...nel.org>,
Thierry Reding <thierry.reding@...il.com>, Jonathan Hunter <jonathanh@...dia.com>,
"Rafael J. Wysocki" <rafael@...nel.org>, Krzysztof Kozlowski <krzk+dt@...nel.org>, linux-kernel@...r.kernel.org,
devicetree@...r.kernel.org, linux-tegra@...r.kernel.org,
linux-pm@...r.kernel.org
Subject: Re: [PATCH 3/8] cpufreq: tegra186: add OPP support and set bandwidth
On Mon, Sep 1, 2025 at 12:53 AM Viresh Kumar <viresh.kumar@...aro.org> wrote:
>
> On 31-08-25, 22:33, Aaron Kling via B4 Relay wrote:
> > diff --git a/drivers/cpufreq/tegra186-cpufreq.c b/drivers/cpufreq/tegra186-cpufreq.c
> > index bd94beebc4cc2fe6870e13ca55343cedb9729e99..f0abb44e2ed00a301161565e4c4f62cfed4a5814 100644
> > --- a/drivers/cpufreq/tegra186-cpufreq.c
> > +++ b/drivers/cpufreq/tegra186-cpufreq.c
> > @@ -18,6 +18,7 @@
> > #define EDVD_CORE_VOLT_FREQ_F_SHIFT 0
> > #define EDVD_CORE_VOLT_FREQ_F_MASK 0xffff
> > #define EDVD_CORE_VOLT_FREQ_V_SHIFT 16
> > +#define KHZ 1000
>
> Can reuse:
>
> include/linux/units.h:#define HZ_PER_KHZ 1000UL
Will do.
>
> > +static int tegra_cpufreq_set_bw(struct cpufreq_policy *policy, unsigned long freq_khz)
> > +{
> > + struct tegra186_cpufreq_data *data = cpufreq_get_driver_data();
> > + struct dev_pm_opp *opp;
> > + struct device *dev;
> > + int ret;
> > +
> > + dev = get_cpu_device(policy->cpu);
> > + if (!dev)
> > + return -ENODEV;
> > +
> > + opp = dev_pm_opp_find_freq_exact(dev, freq_khz * KHZ, true);
> > + if (IS_ERR(opp))
> > + return PTR_ERR(opp);
> > +
> > + ret = dev_pm_opp_set_opp(dev, opp);
>
> Won't it be easier to use dev_pm_opp_set_rate() instead ?
I'm not very familiar with the opp system. If I read correctly,
dev_pm_opp_set_rate() will round to the closest rate while this code
will fail if the exact rate isn't found. This code is based on the
existing tegra194-cpufreq driver. And I'm unsure if this was done for
a reason. I have seen unexpected rates returned from clk_round_rate in
the development of this topic, so that could be related.
>
> > + if (ret)
> > + data->icc_dram_bw_scaling = false;
> > +
> > + dev_pm_opp_put(opp);
>
> The OPP core supports scope based cleanup helpers now, maybe use them
> to remove all these put calls.
I will look into this.
>
> > + return ret;
> > +}
> > +
> > +static int tegra_cpufreq_init_cpufreq_table(struct cpufreq_policy *policy,
> > + struct cpufreq_frequency_table *bpmp_lut,
> > + struct cpufreq_frequency_table **opp_table)
> > +{
> > + struct tegra186_cpufreq_data *data = cpufreq_get_driver_data();
> > + struct cpufreq_frequency_table *freq_table = NULL;
> > + struct cpufreq_frequency_table *pos;
> > + struct device *cpu_dev;
> > + struct dev_pm_opp *opp;
> > + unsigned long rate;
> > + int ret, max_opps;
> > + int j = 0;
> > +
> > + cpu_dev = get_cpu_device(policy->cpu);
> > + if (!cpu_dev) {
> > + pr_err("%s: failed to get cpu%d device\n", __func__, policy->cpu);
> > + return -ENODEV;
> > + }
> > +
> > + /* Initialize OPP table mentioned in operating-points-v2 property in DT */
> > + ret = dev_pm_opp_of_add_table_indexed(cpu_dev, 0);
> > + if (!ret) {
>
> If you handle the error case here, then the below can move out of the
> if/else block.
I'd prefer not to deviate too much from the tegra194-cpufreq code this
is based on, so the drivers can be more easily kept in parity to each
other. But I will look at making this a bit cleaner as per this and
the next comment.
>
> > + max_opps = dev_pm_opp_get_opp_count(cpu_dev);
> > + if (max_opps <= 0) {
> > + dev_err(cpu_dev, "Failed to add OPPs\n");
> > + return max_opps;
> > + }
> > +
> > + /* Disable all opps and cross-validate against LUT later */
> > + for (rate = 0; ; rate++) {
>
> Maybe using while(1) would be more readable ?
>
> > + opp = dev_pm_opp_find_freq_ceil(cpu_dev, &rate);
> > + if (IS_ERR(opp))
> > + break;
> > +
> > + dev_pm_opp_put(opp);
> > + dev_pm_opp_disable(cpu_dev, rate);
> > + }
> > + } else {
> > + dev_err(cpu_dev, "Invalid or empty opp table in device tree\n");
> > + data->icc_dram_bw_scaling = false;
> > + return ret;
> > + }
> > +
> > + freq_table = kcalloc((max_opps + 1), sizeof(*freq_table), GFP_KERNEL);
> > + if (!freq_table)
> > + return -ENOMEM;
> > +
> > + /*
> > + * Cross check the frequencies from BPMP-FW LUT against the OPP's present in DT.
> > + * Enable only those DT OPP's which are present in LUT also.
> > + */
> > + cpufreq_for_each_valid_entry(pos, bpmp_lut) {
> > + opp = dev_pm_opp_find_freq_exact(cpu_dev, pos->frequency * KHZ, false);
> > + if (IS_ERR(opp))
> > + continue;
> > +
> > + dev_pm_opp_put(opp);
> > +
> > + ret = dev_pm_opp_enable(cpu_dev, pos->frequency * KHZ);
> > + if (ret < 0)
> > + return ret;
> > +
> > + freq_table[j].driver_data = pos->driver_data;
> > + freq_table[j].frequency = pos->frequency;
> > + j++;
> > + }
> > +
> > + freq_table[j].driver_data = pos->driver_data;
> > + freq_table[j].frequency = CPUFREQ_TABLE_END;
> > +
> > + *opp_table = &freq_table[0];
> > +
> > + dev_pm_opp_set_sharing_cpus(cpu_dev, policy->cpus);
> > +
> > + tegra_cpufreq_set_bw(policy, freq_table[j - 1].frequency);
>
> Maybe a comment on why exactly you are changing the freq here ?
To my knowledge, this does not change any clocks. The intent here is
to prime the interconnect data. In the pre-req series, there's a
change that sets all clocks to max frequency during probe. Then my use
case involves setting performance governor by default on some boots.
During testing, I noticed that the interconnect data provided by this
driver was all zeroes. Which led me to notice that set_bw is only
called when the target frequency changes. Which wasn't happening
because clocks were already set to max. If my understanding here is
wrong or there's a better way to handle this, I will fix it.
Aaron
Powered by blists - more mailing lists