Message-ID: <CAPDyKFo=j6r+LA06RjD8=v72LK3voK9APR6rNFN+q1Xc_tHn-A@mail.gmail.com>
Date:   Mon, 26 Jun 2023 11:50:54 +0200
From:   Ulf Hansson <ulf.hansson@...aro.org>
To:     Martin Kepplinger <martin.kepplinger@...i.sm>
Cc:     rafael@...nel.org, khilman@...nel.org, robh@...nel.org,
        krzysztof.kozlowski@...aro.org, shawnguo@...nel.org,
        s.hauer@...gutronix.de, festevam@...il.com, pavel@....cz,
        kernel@...i.sm, linux-imx@....com, broonie@...nel.org,
        l.stach@...gutronix.de, aford173@...il.com,
        linux-pm@...r.kernel.org, devicetree@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v6 1/2] power: domain: handle genpd correctly when needing interrupts

On Wed, 21 Jun 2023 at 20:20, Martin Kepplinger
<martin.kepplinger@...i.sm> wrote:
>
> On Friday, 23.09.2022 at 15:55 +0200, Ulf Hansson wrote:
> > On Thu, 25 Aug 2022 at 09:06, Martin Kepplinger
> > <martin.kepplinger@...i.sm> wrote:
> > >
> > > On Wednesday, 24.08.2022 at 15:30 +0200, Ulf Hansson wrote:
> > > > On Mon, 22 Aug 2022 at 10:38, Martin Kepplinger
> > > > <martin.kepplinger@...i.sm> wrote:
> > > > >
> > > > > On Friday, 19.08.2022 at 16:53 +0200, Ulf Hansson wrote:
> > > > > > On Fri, 19 Aug 2022 at 11:17, Martin Kepplinger
> > > > > > <martin.kepplinger@...i.sm> wrote:
> > > > > > >
> > > > > > > On Tuesday, 26.07.2022 at 17:07 +0200, Ulf Hansson wrote:
> > > > > > > > On Tue, 26 Jul 2022 at 10:33, Martin Kepplinger
> > > > > > > > <martin.kepplinger@...i.sm> wrote:
> > > > > > > > >
> > > > > > > > > If for example the power-domains' power-supply node (regulator)
> > > > > > > > > needs interrupts to work, the current setup with noirq callbacks
> > > > > > > > > cannot work; for example a pmic regulator on i2c, when
> > > > > > > > > suspending, usually already times out during suspend_noirq:
> > > > > > > > >
> > > > > > > > > [   41.024193] buck4: failed to disable: -ETIMEDOUT
> > > > > > > > >
> > > > > > > > > So fix system suspend and resume for these power-domains by
> > > > > > > > > using the "outer" suspend/resume callbacks instead. Tested on
> > > > > > > > > the imx8mq-librem5 board, but by looking at the dts, this will
> > > > > > > > > fix imx8mq-evk and possibly many other boards too.
> > > > > > > > >
> > > > > > > > > This is designed so that genpd providers just say "this genpd
> > > > > > > > > needs interrupts" (by setting the flag) - without implying an
> > > > > > > > > implementation.
> > > > > > > > >
> > > > > > > > > Initially system suspend problems had been discussed at
> > > > > > > > > https://lore.kernel.org/linux-arm-kernel/20211002005954.1367653-8-l.stach@pengutronix.de/
> > > > > > > > > which led to discussing the pmic that contains the regulators
> > > > > > > > > which serve as power-domain power-supplies:
> > > > > > > > > https://lore.kernel.org/linux-pm/573166b75e524517782471c2b7f96e03fd93d175.camel@puri.sm/T/
> > > > > > > > >
> > > > > > > > > Signed-off-by: Martin Kepplinger <martin.kepplinger@...i.sm>
> > > > > > > > > ---
> > > > > > > > >  drivers/base/power/domain.c | 13 +++++++++++--
> > > > > > > > >  include/linux/pm_domain.h   |  5 +++++
> > > > > > > > >  2 files changed, 16 insertions(+), 2 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> > > > > > > > > index 5a2e0232862e..58376752a4de 100644
> > > > > > > > > --- a/drivers/base/power/domain.c
> > > > > > > > > +++ b/drivers/base/power/domain.c
> > > > > > > > > @@ -130,6 +130,7 @@ static const struct genpd_lock_ops genpd_spin_ops = {
> > > > > > > > >  #define genpd_is_active_wakeup(genpd)  (genpd->flags & GENPD_FLAG_ACTIVE_WAKEUP)
> > > > > > > > >  #define genpd_is_cpu_domain(genpd)     (genpd->flags & GENPD_FLAG_CPU_DOMAIN)
> > > > > > > > >  #define genpd_is_rpm_always_on(genpd)  (genpd->flags & GENPD_FLAG_RPM_ALWAYS_ON)
> > > > > > > > > +#define genpd_irq_on(genpd)            (genpd->flags & GENPD_FLAG_IRQ_ON)
> > > > > > > > >
> > > > > > > > >  static inline bool irq_safe_dev_in_sleep_domain(struct device *dev,
> > > > > > > > >                 const struct generic_pm_domain *genpd)
> > > > > > > > > @@ -2065,8 +2066,15 @@ int pm_genpd_init(struct generic_pm_domain *genpd,
> > > > > > > > >         genpd->domain.ops.runtime_suspend = genpd_runtime_suspend;
> > > > > > > > >         genpd->domain.ops.runtime_resume = genpd_runtime_resume;
> > > > > > > > >         genpd->domain.ops.prepare = genpd_prepare;
> > > > > > > > > -       genpd->domain.ops.suspend_noirq = genpd_suspend_noirq;
> > > > > > > > > -       genpd->domain.ops.resume_noirq = genpd_resume_noirq;
> > > > > > > > > +
> > > > > > > > > +       if (genpd_irq_on(genpd)) {
> > > > > > > > > +               genpd->domain.ops.suspend = genpd_suspend_noirq;
> > > > > > > > > +               genpd->domain.ops.resume = genpd_resume_noirq;
> > > > > > > > > +       } else {
> > > > > > > > > +               genpd->domain.ops.suspend_noirq = genpd_suspend_noirq;
> > > > > > > > > +               genpd->domain.ops.resume_noirq = genpd_resume_noirq;
> > > > > > > >
> > > > > > > > As we discussed previously, I am thinking that it may be better
> > > > > > > > to move to using genpd->domain.ops.suspend_late and
> > > > > > > > genpd->domain.ops.resume_early instead.
> > > > > > >
> > > > > > > Wouldn't that better be a separate patch (on top)? Do you really
> > > > > > > want me to change the current behaviour (default case) from
> > > > > > > noirq to late? Then I'll resend this series with such a patch
> > > > > > > added.
> > > > > >
> > > > > > Sorry, I wasn't clear enough, the default behaviour should remain
> > > > > > as is.
> > > > > >
> > > > > > What I meant was, when genpd_irq_on() is true, we should use the
> > > > > > genpd->domain.ops.suspend_late and
> > > > > > genpd->domain.ops.resume_early.
> > > > >
> > > > > Testing that shows that this isn't working. I can provide the logs
> > > > > later, but suspend fails and I think it makes sense: "suspend_late"
> > > > > is simply already too late when i2c (or any needed driver) uses
> > > > > "suspend".
> > > >
> > > > Okay, I see.
> > > >
> > > > The reason why I suggested moving the callbacks to "suspend_late"
> > > > was that I was worried that some of the devices attached to genpd
> > > > could use "suspend_late" themselves. This is the case for some
> > > > drivers for DMA/clock/gpio/pinctrl-controllers, for example. That
> > > > said, I am curious to look at the DT files for the platform you are
> > > > running, would you mind giving me a pointer?
> > >
> > > I'm running
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/freescale/imx8mq-librem5.dtsi
> > > with these (small) patches on top:
> > > https://source.puri.sm/martin.kepplinger/linux-next/-/commits/5.19.3/librem5
> >
> > Thanks for sharing the information!
> >
> > >
> > > >
> > > > So, this made me think about this a bit more. In the end, just
> > > > using different levels (suspend, suspend_late, suspend_noirq) of
> > > > callbacks is just papering over the real *dependency* problem.
> > >
> > > true, it doesn't feel like a stable solution.
> > >
> > > >
> > > > What we need for the genpd provider driver, is to be asked to be
> > > > suspended under the following conditions:
> > > > 1. All consumer devices (and child-domains) for its corresponding
> > > > PM domain have been suspended.
> > > > 2. All its supplier devices must remain resumed, until the genpd
> > > > provider has been suspended.
> > > >
> > > > Please allow me a few more days to think in more detail about this.
> > >
> > > Thanks a lot for thinking about this!
> >
> > I have done some more thinking, but it's been a busy period for me,
> > so unfortunately I need some additional time (another week). It seems
> > like I also need to do some prototyping, to convince myself about the
> > approach.
> >
> > So, my apologies for the delay!
> >
> > Kind regards
> > Uffe
>
> Hi Ulf and all interested,
>
> Has there been any development regarding this bug? - genpd that needs
> interrupts for power-on/off being run in noirq phases - you remember
> it? it's been a while :)

Yes, sorry for the lack of progress on my side. Except for some thinking
and drawing, I don't have an update.

Although, to clarify, I have not forgotten about it. It's on my TODO
list of prioritized things. I just need to complete a couple of other
things before I come to this and I will certainly keep you in the loop
if I post something.

>
> Anyway, I still run these patches and while it's a reasonable workaround
> IMO, I wanted to check whether you are aware of anything that might
> solve this. (or maybe it *is* solved and I simply overlooked it because
> my patches still apply?)

The problem is still there, unfortunately.
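
As a reminder of what a proper fix has to guarantee, the two conditions
quoted above (consumers suspend before the genpd provider, and the
provider's suppliers stay resumed until the provider itself has suspended)
boil down to one rule: a node may only suspend once everything depending
on it has suspended. Below is a minimal user-space sketch of that rule in
C. It is not kernel code and not a proposed implementation; struct pm_node
and the pm_node_*/pm_suspend_all helpers are hypothetical names made up
purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPENDENTS 4

/* Hypothetical model of a device or genpd provider; not a kernel structure. */
struct pm_node {
	const char *name;
	bool suspended;
	size_t n_dependents;
	/* Nodes that need this one to stay resumed while they are active. */
	struct pm_node *dependents[MAX_DEPENDENTS];
};

/* Record that 'user' needs 'provider' to stay resumed while 'user' is active. */
static void pm_node_add_dependent(struct pm_node *provider, struct pm_node *user)
{
	assert(provider->n_dependents < MAX_DEPENDENTS);
	provider->dependents[provider->n_dependents++] = user;
}

/* A node may suspend only once everything depending on it has suspended. */
static bool pm_node_can_suspend(const struct pm_node *node)
{
	for (size_t i = 0; i < node->n_dependents; i++)
		if (!node->dependents[i]->suspended)
			return false;
	return true;
}

/*
 * Suspend all nodes in dependency order and store that order in 'order'.
 * Returns the number of nodes suspended by this call; a result smaller
 * than the number of still-resumed nodes indicates a dependency cycle.
 */
static size_t pm_suspend_all(struct pm_node **nodes, size_t n,
			     struct pm_node **order)
{
	size_t done = 0;
	bool progress = true;

	while (done < n && progress) {
		progress = false;
		for (size_t i = 0; i < n; i++) {
			if (nodes[i]->suspended || !pm_node_can_suspend(nodes[i]))
				continue;
			nodes[i]->suspended = true;
			order[done++] = nodes[i];
			progress = true;
		}
	}
	return done;
}
```

With a PMIC regulator supplying a power domain that in turn powers a GPU,
registering the GPU as a dependent of the domain and the domain as a
dependent of the PMIC makes pm_suspend_all() produce the order GPU, domain,
PMIC, regardless of the order in which the nodes are listed. The fixed
suspend/suspend_late/suspend_noirq phases cannot express this in general,
which is the dependency problem described earlier in the thread.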

Kind regards
Uffe
