[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7c6df194-fce1-401a-98c5-c903d78627c4@leemhuis.info>
Date: Fri, 10 May 2024 10:54:25 +0200
From: "Linux regression tracking (Thorsten Leemhuis)"
<regressions@...mhuis.info>
To: Ulf Hansson <ulf.hansson@...aro.org>
Cc: Viresh Kumar <viresh.kumar@...aro.org>, Viresh Kumar
<vireshk@...nel.org>, Nishanth Menon <nm@...com>,
Stephen Boyd <sboyd@...nel.org>, "Rafael J. Wysocki" <rafael@...nel.org>,
linux-pm@...r.kernel.org, Vincent Guittot <vincent.guittot@...aro.org>,
Vladimir Lypak <vladimir.lypak@...il.com>, linux-kernel@...r.kernel.org,
Linux kernel regressions list <regressions@...ts.linux.dev>
Subject: Re: [PATCH V2] OPP: Fix required_opp_tables for multiple genpds using
same table
On 10.05.24 10:37, Ulf Hansson wrote:
> On Thu, 9 May 2024 at 14:35, Thorsten Leemhuis
> <regressions@...mhuis.info> wrote:
>> On 12.04.24 08:41, Viresh Kumar wrote:
>>> The required_opp_tables parsing is not perfect, as the OPP core does the
>>> parsing solely based on the DT node pointers.
>>>
>>> The core sets the required_opp_tables entry to the first OPP table in
>>> the "opp_tables" list, that matches with the node pointer.
>>>
>>> If the target DT OPP table is used by multiple devices and they all
>>> create separate instances of 'struct opp_table' from it, then it is
>>> possible that the required_opp_tables entry may be set to the incorrect
>>> sibling device.
>>>
>>> Unfortunately, there is no clear way to initialize the right values
>>> during the initial parsing and we need to do this at a later point of
>>> time.
>>>
>>> Cross check the OPP table again while the genpds are attached and fix
>>> them if required.
>>>
>>> Also add a new API for the genpd core to fetch the device pointer for
>>> the genpd.
>>>
>>> Cc: Thorsten Leemhuis <regressions@...mhuis.info>
>>> Reported-by: Vladimir Lypak <vladimir.lypak@...il.com>
>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218682
>>
>> Did this fall through the cracks? Just wondering, as from here it looks
>> like for about four weeks now nothing happened to fix the regression
>> linked above. But I might have missed something. Or is everybody waiting
>> for a test from the reporter?
>
> I have chatted a bit with Viresh about this problem offlist, while
> both me and him are/have been on vacations. Sorry for the delay and
> confusion.
>
> The latest update from my side is that I am working on a solution,
> that aim to remove the entire dev|devm_pm_opp_detach_genpd() API.
That sounds like something that would have to wait for a merge window;
so given the timing I assume this would mean that the earliest point in
time to merge this would be for 6.11-rc1, which is ~2 months away --
plus another 9 or 10 weeks until the fix would reach users.
> Instead, the plan is to move consumer drivers to use
> dev_pm_domain_attach_list() to attach multiple PM domains per device.
> When it comes to hooking up the required-opps-tables/devs, I think
> genpd should be able to manage this during the device attach process.
> In this way, consumer drivers shouldn't need to care about this at
> all.
>
> That said, I am hoping that $subject patch should not be needed.
> Although, I need a bit more time before I am ready to post a patchset
> for the above.
>
> What do you think?
Given that the report is already more than a month old now and what I
assumed above (which might be wrong), this makes me wonder: is there a
downside if we apply this patch now, and simply revert this later when
your proper solution is merged? I would assume that is what Linus want
in this case to honor the "no regressions" rule.
Might be something different if this is something like a really odd
corner case we assume nobody (or nearly nobody) will run into in
practice. But as somebody noticed this, I assume that is not the case.
Ciao, Thorsten
>>> Co-developed-by: Vladimir Lypak <vladimir.lypak@...il.com>
>>> Signed-off-by: Viresh Kumar <viresh.kumar@...aro.org>
>>> ---
>>> V2:
>>> - Fix an `if` condition.
>>> - s/Bugzilla/Closes/ and change ordering.
>>>
>>> drivers/opp/core.c | 31 ++++++++++++++++++++++++++++++-
>>> drivers/pmdomain/core.c | 10 ++++++++++
>>> include/linux/pm_domain.h | 6 ++++++
>>> 3 files changed, 46 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
>>> index e233734b7220..cb4611fe1b5b 100644
>>> --- a/drivers/opp/core.c
>>> +++ b/drivers/opp/core.c
>>> @@ -2394,7 +2394,8 @@ static void _opp_detach_genpd(struct opp_table *opp_table)
>>> static int _opp_attach_genpd(struct opp_table *opp_table, struct device *dev,
>>> const char * const *names, struct device ***virt_devs)
>>> {
>>> - struct device *virt_dev;
>>> + struct device *virt_dev, *gdev;
>>> + struct opp_table *genpd_table;
>>> int index = 0, ret = -EINVAL;
>>> const char * const *name = names;
>>>
>>> @@ -2427,6 +2428,34 @@ static int _opp_attach_genpd(struct opp_table *opp_table, struct device *dev,
>>> goto err;
>>> }
>>>
>>> + /*
>>> + * The required_opp_tables parsing is not perfect, as the OPP
>>> + * core does the parsing solely based on the DT node pointers.
>>> + * The core sets the required_opp_tables entry to the first OPP
>>> + * table in the "opp_tables" list, that matches with the node
>>> + * pointer.
>>> + *
>>> + * If the target DT OPP table is used by multiple devices and
>>> + * they all create separate instances of 'struct opp_table' from
>>> + * it, then it is possible that the required_opp_tables entry
>>> + * may be set to the incorrect sibling device.
>>> + *
>>> + * Cross check it again and fix if required.
>>> + */
>>> + gdev = dev_to_genpd_dev(virt_dev);
>>> + if (IS_ERR(gdev))
>>> + return PTR_ERR(gdev);
>>> +
>>> + genpd_table = _find_opp_table(gdev);
>>> + if (!IS_ERR(genpd_table)) {
>>> + if (genpd_table != opp_table->required_opp_tables[index]) {
>>> + dev_pm_opp_put_opp_table(opp_table->required_opp_tables[index]);
>>> + opp_table->required_opp_tables[index] = genpd_table;
>>> + } else {
>>> + dev_pm_opp_put_opp_table(genpd_table);
>>> + }
>>> + }
>>> +
>>> /*
>>> * Add the virtual genpd device as a user of the OPP table, so
>>> * we can call dev_pm_opp_set_opp() on it directly.
>>> diff --git a/drivers/pmdomain/core.c b/drivers/pmdomain/core.c
>>> index 4215ffd9b11c..c40eda92a85a 100644
>>> --- a/drivers/pmdomain/core.c
>>> +++ b/drivers/pmdomain/core.c
>>> @@ -184,6 +184,16 @@ static struct generic_pm_domain *dev_to_genpd(struct device *dev)
>>> return pd_to_genpd(dev->pm_domain);
>>> }
>>>
>>> +struct device *dev_to_genpd_dev(struct device *dev)
>>> +{
>>> + struct generic_pm_domain *genpd = dev_to_genpd(dev);
>>> +
>>> + if (IS_ERR(genpd))
>>> + return ERR_CAST(genpd);
>>> +
>>> + return &genpd->dev;
>>> +}
>>> +
>>> static int genpd_stop_dev(const struct generic_pm_domain *genpd,
>>> struct device *dev)
>>> {
>>> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
>>> index 772d3280d35f..f24546a3d3db 100644
>>> --- a/include/linux/pm_domain.h
>>> +++ b/include/linux/pm_domain.h
>>> @@ -260,6 +260,7 @@ int pm_genpd_remove_subdomain(struct generic_pm_domain *genpd,
>>> int pm_genpd_init(struct generic_pm_domain *genpd,
>>> struct dev_power_governor *gov, bool is_off);
>>> int pm_genpd_remove(struct generic_pm_domain *genpd);
>>> +struct device *dev_to_genpd_dev(struct device *dev);
>>> int dev_pm_genpd_set_performance_state(struct device *dev, unsigned int state);
>>> int dev_pm_genpd_add_notifier(struct device *dev, struct notifier_block *nb);
>>> int dev_pm_genpd_remove_notifier(struct device *dev);
>>> @@ -307,6 +308,11 @@ static inline int pm_genpd_remove(struct generic_pm_domain *genpd)
>>> return -EOPNOTSUPP;
>>> }
>>>
>>> +static inline struct device *dev_to_genpd_dev(struct device *dev)
>>> +{
>>> + return ERR_PTR(-EOPNOTSUPP);
>>> +}
>>> +
>>> static inline int dev_pm_genpd_set_performance_state(struct device *dev,
>>> unsigned int state)
>>> {
>
>
Powered by blists - more mailing lists