[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPDyKFrOSd2xEXuvDki9Em+xFLHfeTfZz3NtnWwNmWB1H6i=Kg@mail.gmail.com>
Date: Thu, 21 Oct 2021 20:11:55 +0200
From: Ulf Hansson <ulf.hansson@...aro.org>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: Daniel Lezcano <daniel.lezcano@...aro.org>,
Linux PM <linux-pm@...r.kernel.org>,
Maulik Shah <mkshah@...eaurora.org>,
Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Len Brown <len.brown@...el.com>,
Bjorn Andersson <bjorn.andersson@...aro.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] PM: sleep: Fix runtime PM based cpuidle support
On Thu, 21 Oct 2021 at 18:33, Rafael J. Wysocki <rafael@...nel.org> wrote:
>
> On Thu, Oct 21, 2021 at 6:17 PM Ulf Hansson <ulf.hansson@...aro.org> wrote:
> >
> > On Thu, 21 Oct 2021 at 17:09, Rafael J. Wysocki <rafael@...nel.org> wrote:
> > >
> > > On Thu, Oct 21, 2021 at 4:05 PM Ulf Hansson <ulf.hansson@...aro.org> wrote:
> > > >
> > > > On Thu, 21 Oct 2021 at 15:45, Rafael J. Wysocki <rafael@...nel.org> wrote:
> > > > >
> > > > > On Thu, Oct 21, 2021 at 1:49 PM Ulf Hansson <ulf.hansson@...aro.org> wrote:
> > > > > >
> > > > > > On Wed, 20 Oct 2021 at 20:18, Rafael J. Wysocki <rafael@...nel.org> wrote:
> > > > > > >
> > > > > > > On Wed, Sep 29, 2021 at 4:44 PM Ulf Hansson <ulf.hansson@...aro.org> wrote:
> > > > > > > >
> > > > > > > > In the cpuidle-psci case, runtime PM in combination with the generic PM
> > > > > > > > domain (genpd), may be used when entering/exiting an idlestate. More
> > > > > > > > precisely, genpd relies on runtime PM to be enabled for the attached device
> > > > > > > > (in this case it belongs to a CPU), to properly manage the reference
> > > > > > > > counting of its PM domain.
> > > > > > > >
> > > > > > > > This works fine most of the time, but during system suspend in the
> > > > > > > > dpm_suspend_late() phase, the PM core disables runtime PM for all devices.
> > > > > > > > Beyond this point and until runtime PM becomes re-enabled in the
> > > > > > > > dpm_resume_early() phase, calls to pm_runtime_get|put*() will fail.
> > > > > > > >
> > > > > > > > To make sure the reference counting in genpd becomes correct, we need to
> > > > > > > > prevent cpuidle-psci from using runtime PM when it has been disabled for
> > > > > > > > the device. Therefore, let's move the call to cpuidle_pause() from
> > > > > > > > dpm_suspend_noirq() to dpm_suspend_late() - and cpuidle_resume() from
> > > > > > > > dpm_resume_noirq() into dpm_resume_early().
> > > > > > > >
> > > > > > > > Diagnosed-by: Maulik Shah <mkshah@...eaurora.org>
> > > > > > > > Suggested-by: Maulik Shah <mkshah@...eaurora.org>
> > > > > > > > Signed-off-by: Ulf Hansson <ulf.hansson@...aro.org>
> > > > > > > > ---
> > > > > > > > drivers/base/power/main.c | 6 ++----
> > > > > > > > 1 file changed, 2 insertions(+), 4 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> > > > > > > > index cbea78e79f3d..1c753b651272 100644
> > > > > > > > --- a/drivers/base/power/main.c
> > > > > > > > +++ b/drivers/base/power/main.c
> > > > > > > > @@ -747,8 +747,6 @@ void dpm_resume_noirq(pm_message_t state)
> > > > > > > >
> > > > > > > > resume_device_irqs();
> > > > > > > > device_wakeup_disarm_wake_irqs();
> > > > > > > > -
> > > > > > > > - cpuidle_resume();
> > > > > > > > }
> > > > > > > >
> > > > > > > > /**
> > > > > > > > @@ -870,6 +868,7 @@ void dpm_resume_early(pm_message_t state)
> > > > > > > > }
> > > > > > > > mutex_unlock(&dpm_list_mtx);
> > > > > > > > async_synchronize_full();
> > > > > > > > + cpuidle_resume();
> > > > > > > > dpm_show_time(starttime, state, 0, "early");
> > > > > > > > trace_suspend_resume(TPS("dpm_resume_early"), state.event, false);
> > > > > > > > }
> > > > > > > > @@ -1336,8 +1335,6 @@ int dpm_suspend_noirq(pm_message_t state)
> > > > > > > > {
> > > > > > > > int ret;
> > > > > > > >
> > > > > > > > - cpuidle_pause();
> > > > > > > > -
> > > > > > > > device_wakeup_arm_wake_irqs();
> > > > > > > > suspend_device_irqs();
> > > > > > > >
> > > > > > > > @@ -1467,6 +1464,7 @@ int dpm_suspend_late(pm_message_t state)
> > > > > > > > int error = 0;
> > > > > > > >
> > > > > > > > trace_suspend_resume(TPS("dpm_suspend_late"), state.event, true);
> > > > > > > > + cpuidle_pause();
> > > > > > > > mutex_lock(&dpm_list_mtx);
> > > > > > > > pm_transition = state;
> > > > > > > > async_error = 0;
> > > > > > > > --
> > > > > > >
> > > > > > > Well, this is somewhat heavy-handed and it affects even the systems
> > > > > > > that don't really need to pause cpuidle at all in the suspend path.
> > > > > >
> > > > > > Yes, I agree.
> > > > > >
> > > > > > Although, I am not really changing the behaviour in regards to this.
> > > > > > cpuidle_pause() is already being called in dpm_suspend_noirq(), for
> > > > > > everybody today.
> > > > >
> > > > > Yes, it is, but pausing it earlier will cause more energy to be spent,
> > > > > potentially.
> > > > >
> > > > > That said, there are not too many users of suspend_late callbacks in
> > > > > the tree, so it may not matter too much.
> > > > >
> > > > > > >
> > > > > > > Also, IIUC you don't need to pause cpuidle completely, but make it
> > > > > > > temporarily avoid idle states potentially affected by this issue. An
> > > > > > > additional CPUIDLE_STATE_DISABLED_ flag could be used for that I
> > > > > > > suppose and it could be set via cpuidle_suspend() called from the core
> > > > > > > next to cpufreq_suspend().
> > > > > >
> > > > > > cpuidle_suspend() would then need to go and fetch the cpuidle driver
> > > > > > instance, which in some cases is one driver per CPU. Doesn't that get
> > > > > > rather messy?
> > > > >
> > > > > Per-CPU variables are used for that, so it is quite straightforward.
> > > > >
> > > > > > Additionally, since find_deepest_state() is being called for
> > > > > > cpuidle_enter_s2idle() too, we would need to treat the new
> > > > > > CPUIDLE_STATE_DISABLED_ flag in a special way, right?
> > > > >
> > > > > No, it already checks "disabled".
> > > >
> > > > Yes, but that would be wrong.
> > >
> > > Hmmm.
> > >
> > > > The use case I want to support, for cpuidle-psci, is to allow all idle
> > > > states in suspend-to-idle,
> > >
> > > So does PM-runtime work in suspend-to-idle? How?
> >
> > No it doesn't. See below.
> >
> > >
> > > > but prevent those that rely on runtime PM
> > > > (after it has been disabled) for the regular idle path.
> > >
> > > Do you have a special suspend-to-idle handling of those states that
> > > doesn't require PM-runtime?
> >
> > Yes. Feel free to have a look in __psci_enter_domain_idle_state().
>
> So in theory you could check the pm_runtime_put_sync_suspend() return
> value and fall back to something like WFI if that's an error code.
I have already tried that, but it simply got too complicated. The main
issue was that runtime PM could become disabled for the device in the
middle of executing the ->enter() callback.
For example, if pm_runtime_get_sync() fails, I still need to make sure
the reference counting in genpd becomes correct - and I can't do that
using dev_pm_genpd_resume(). That's because it's not designed to be
called in this "unknown" suspend phase, but should be called after the
noirq phase and be properly balanced with dev_pm_genpd_suspend().
In other words, the error path didn't work out for me.
[...]
Kind regards
Uffe
Powered by blists - more mailing lists