[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87073fc2-d7b3-98f4-0067-29430ea2adef@gmail.com>
Date: Thu, 19 Aug 2021 22:35:57 +0300
From: Dmitry Osipenko <digetx@...il.com>
To: Ulf Hansson <ulf.hansson@...aro.org>,
Viresh Kumar <viresh.kumar@...aro.org>
Cc: Thierry Reding <thierry.reding@...il.com>,
Jonathan Hunter <jonathanh@...dia.com>,
Viresh Kumar <vireshk@...nel.org>,
Stephen Boyd <sboyd@...nel.org>,
Peter De Schrijver <pdeschrijver@...dia.com>,
Mikko Perttunen <mperttunen@...dia.com>,
Peter Chen <peter.chen@...nel.org>,
Mark Brown <broonie@...nel.org>,
Lee Jones <lee.jones@...aro.org>,
Uwe Kleine-König <u.kleine-koenig@...gutronix.de>,
Nishanth Menon <nm@...com>,
Vignesh Raghavendra <vigneshr@...com>,
Richard Weinberger <richard@....at>,
Miquel Raynal <miquel.raynal@...tlin.com>,
Lucas Stach <dev@...xeye.de>, Stefan Agner <stefan@...er.ch>,
Adrian Hunter <adrian.hunter@...el.com>,
Mauro Carvalho Chehab <mchehab@...nel.org>,
Rob Herring <robh+dt@...nel.org>,
Michael Turquette <mturquette@...libre.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-tegra <linux-tegra@...r.kernel.org>,
Linux PM <linux-pm@...r.kernel.org>,
Linux USB List <linux-usb@...r.kernel.org>,
linux-staging@...ts.linux.dev, linux-spi@...r.kernel.org,
linux-pwm@...r.kernel.org, linux-mtd@...ts.infradead.org,
linux-mmc <linux-mmc@...r.kernel.org>,
Linux Media Mailing List <linux-media@...r.kernel.org>,
dri-devel <dri-devel@...ts.freedesktop.org>,
DTML <devicetree@...r.kernel.org>,
linux-clk <linux-clk@...r.kernel.org>
Subject: Re: [PATCH v8 01/34] opp: Add dev_pm_opp_sync() helper
19.08.2021 16:07, Ulf Hansson пишет:
> On Wed, 18 Aug 2021 at 17:43, Dmitry Osipenko <digetx@...il.com> wrote:
>>
>> 18.08.2021 13:08, Ulf Hansson пишет:
>>> On Wed, 18 Aug 2021 at 11:50, Viresh Kumar <viresh.kumar@...aro.org> wrote:
>>>>
>>>> On 18-08-21, 11:41, Ulf Hansson wrote:
>>>>> On Wed, 18 Aug 2021 at 11:14, Viresh Kumar <viresh.kumar@...aro.org> wrote:
>>>>>> What we need here is just configure. So something like this then:
>>>>>>
>>>>>> - genpd->get_performance_state()
>>>>>> -> dev_pm_opp_get_current_opp() //New API
>>>>>> -> dev_pm_genpd_set_performance_state(dev, current_opp->pstate);
>>>>>>
>>>>>> This can be done just once from probe() then.
>>>>>
>>>>> How would dev_pm_opp_get_current_opp() work? Do you have a suggestion?
>>>>
>>>> The opp core already has a way of finding current OPP, that's what
>>>> Dmitry is trying to use here. It finds it using clk_get_rate(), if
>>>> that is zero, it picks the lowest freq possible.
>>>>
>>>>> I am sure I understand the problem. When a device is getting probed,
>>>>> it needs to consume power, how else can the corresponding driver
>>>>> successfully probe it?
>>>>
>>>> Dmitry can answer that better, but a device doesn't necessarily need
>>>> to consume energy in probe. It can consume bus clock, like APB we
>>>> have, but the more energy consuming stuff can be left disabled until
>>>> the time a user comes up. Probe will just end up registering the
>>>> driver and initializing it.
>>>
>>> That's perfectly fine, as then it's likely that it won't vote for an
>>> OPP, but can postpone that as well.
>>>
>>> Perhaps the problem is rather that the HW may already carry a non-zero
>>> vote made from a bootloader. If the consumer driver tries to clear
>>> that vote (calling dev_pm_opp_set_rate(dev, 0), for example), it would
>>> still not lead to any updates of the performance state in genpd,
>>> because genpd internally has initialized the performance-state to
>>> zero.
>>
>> We don't need to discover internal SoC devices because we use
>> device-tree on ARM. For most devices power isn't required at a probe
>> time because probe function doesn't touch h/w at all, thus devices are
>> left in suspended state after probe.
>>
>> We have three components comprising PM on Tegra:
>>
>> 1. Power gate
>> 2. Clock state
>> 3. Voltage state
>>
>> GENPD on/off represents the 'power gate'.
>>
>> Clock and reset are controlled by device drivers using clk and rst APIs.
>>
>> Voltage state is represented by GENPD's performance level.
>>
>> GENPD core assumes that at a first rpm-resume of a consumer device, its
>> genpd_performance=0. Not true for Tegra because h/w of the device is
>> preconfigured to a non-zero perf level initially, h/w may not support
>> zero level at all.
>
> I think you may be misunderstanding genpd's behaviour around this, but
> let me elaborate.
>
> In genpd_runtime_resume(), we try to restore the performance state for
> the device that genpd_runtime_suspend() *may* have dropped earlier.
> That means, if genpd_runtime_resume() is called prior
> genpd_runtime_suspend() for the first time, it means that
> genpd_runtime_resume() will *not* restore a performance state, but
> instead just leave the performance state as is for the device (see
> genpd_restore_performance_state()).
>
> In other words, a consumer driver may use the following sequence to
> set an initial performance state for the device during ->probe():
>
> ...
> rate = clk_get_rate()
> dev_pm_opp_set_rate(rate)
>
> pm_runtime_enable()
> pm_runtime_resume_and_get()
> ...
>
> Note that, it's the consumer driver's responsibility to manage device
> specific resources, in its ->runtime_suspend|resume() callbacks.
> Typically that means dealing with clock gating/ungating, for example.
>
> In the other scenario where a consumer driver prefers to *not* call
> pm_runtime_resume_and_get() in its ->probe(), because it doesn't need
> to power on the device to complete probing, then we don't want to vote
> for an OPP at all - and we also want the performance state for the
> device in genpd to be set to zero. Correct?
Yes
> Is this the main problem you are trying to solve, because I think this
> doesn't work out of the box as of today?
The main problem is that the restored performance state is zero for the
first genpd_runtime_resume(), while it's not zero from the h/w perspective.
> There is another concern though, but perhaps it's not a problem after
> all. Viresh told us that dev_pm_opp_set_rate() may turn on resources
> like clock/regulators. That could certainly be problematic, in
> particular if the device and its genpd have OPP tables associated with
> it and the consumer driver wants to follow the above sequence in
> probe.
dev_pm_opp_set_rate() won't enable clocks and regulators, but it may
change the clock rate and voltage. This is also platform/driver specific
because it's up to OPP user how to configure OPP table. On Tegra we only
assign clock to OPP table, regulators are unused.
> Viresh, can you please chime in here and elaborate on some of the
> magic happening behind dev_pm_opp_set_rate() API - is there a problem
> here or not?
>
>>
>> GENPD core assumes that consumer devices can work at any performance
>> level. Not true for Tegra because voltage needs to be set in accordance
>> to the clock rate before clock is enabled, otherwise h/w won't work
>> properly, perhaps clock may be unstable or h/w won't be latching.
>
> Correct. Genpd relies on the callers to use the OPP framework if there
> are constraints like you describe above.
>
> That said, it's not forbidden for a consumer driver to call
> dev_pm_genpd_set_performance_state() directly, but then it better
> knows exactly what it's doing.
>
>>
>> Performance level should be set to 0 while device is suspended.
>
> Do you mean system suspend or runtime suspend? Or both?
Runtime suspend.
>> Performance level needs to be bumped on rpm-resume of a device in
>> accordance to h/w state before hardware is enabled.
>
> Assuming there was a performance state set for the device when
> genpd_runtime_suspend() was called, genpd_runtime_resume() will
> restore that state according to the sequence you described.
What do you think about adding API that will allow drivers to explicitly
set the restored performance state of a power domain?
Another option could be to change the GENPD core, making it to set the
rpm_pstate when dev_pm_genpd_set_performance_state(dev) is invoked and
device is rpm-suspended, instead of calling the
genpd->set_performance_state callback.
Then drivers will be able to sync the perf state at a probe time.
What do you think?
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index a934c679e6ce..cc15ab9eacc9 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -435,7 +435,7 @@ static void genpd_restore_performance_state(struct
device *dev,
int dev_pm_genpd_set_performance_state(struct device *dev, unsigned int
state)
{
struct generic_pm_domain *genpd;
- int ret;
+ int ret = 0;
genpd = dev_to_genpd_safe(dev);
if (!genpd)
@@ -446,7 +446,10 @@ int dev_pm_genpd_set_performance_state(struct
device *dev, unsigned int state)
return -EINVAL;
genpd_lock(genpd);
- ret = genpd_set_performance_state(dev, state);
+ if (pm_runtime_suspended(dev))
+ dev_gpd_data(dev)->rpm_pstate = state;
+ else
+ ret = genpd_set_performance_state(dev, state);
genpd_unlock(genpd);
return ret;
Powered by blists - more mailing lists