lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPDyKFpWzLdKr0bYX4VYwNpPuJNEs=weEpNpDH6zfv9e8SaxJQ@mail.gmail.com>
Date:   Tue, 5 Oct 2021 10:45:38 +0200
From:   Ulf Hansson <ulf.hansson@...aro.org>
To:     Dmitry Osipenko <digetx@...il.com>
Cc:     "Rafael J. Wysocki" <rafael@...nel.org>,
        Thierry Reding <thierry.reding@...il.com>,
        Jonathan Hunter <jonathanh@...dia.com>,
        Viresh Kumar <vireshk@...nel.org>,
        Stephen Boyd <sboyd@...nel.org>,
        Peter De Schrijver <pdeschrijver@...dia.com>,
        Mikko Perttunen <mperttunen@...dia.com>,
        Peter Chen <peter.chen@...nel.org>,
        Lee Jones <lee.jones@...aro.org>,
        Uwe Kleine-König 
        <u.kleine-koenig@...gutronix.de>, Nishanth Menon <nm@...com>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Michael Turquette <mturquette@...libre.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-tegra <linux-tegra@...r.kernel.org>,
        Linux PM <linux-pm@...r.kernel.org>,
        Linux USB List <linux-usb@...r.kernel.org>,
        linux-staging@...ts.linux.dev, linux-pwm@...r.kernel.org,
        linux-mmc <linux-mmc@...r.kernel.org>,
        dri-devel <dri-devel@...ts.freedesktop.org>,
        DTML <devicetree@...r.kernel.org>,
        linux-clk <linux-clk@...r.kernel.org>,
        Mark Brown <broonie@...nel.org>,
        Vignesh Raghavendra <vigneshr@...com>,
        Richard Weinberger <richard@....at>,
        Miquel Raynal <miquel.raynal@...tlin.com>,
        Lucas Stach <dev@...xeye.de>, Stefan Agner <stefan@...er.ch>,
        Mauro Carvalho Chehab <mchehab@...nel.org>,
        David Heidelberg <david@...t.cz>
Subject: Re: [PATCH v13 13/35] drm/tegra: gr2d: Support generic power domain
 and runtime PM

On Mon, 4 Oct 2021 at 17:57, Dmitry Osipenko <digetx@...il.com> wrote:
>
> 04.10.2021 14:01, Ulf Hansson пишет:
> > On Fri, 1 Oct 2021 at 21:00, Dmitry Osipenko <digetx@...il.com> wrote:
> >>
> >> 01.10.2021 17:55, Ulf Hansson пишет:
> >>> On Fri, 1 Oct 2021 at 16:29, Dmitry Osipenko <digetx@...il.com> wrote:
> >>>>
> >>>> 01.10.2021 16:39, Ulf Hansson пишет:
> >>>>> On Mon, 27 Sept 2021 at 00:42, Dmitry Osipenko <digetx@...il.com> wrote:
> >>>>>>
> >>>>>> Add runtime power management and support generic power domains.
> >>>>>>
> >>>>>> Tested-by: Peter Geis <pgwipeout@...il.com> # Ouya T30
> >>>>>> Tested-by: Paul Fertser <fercerpav@...il.com> # PAZ00 T20
> >>>>>> Tested-by: Nicolas Chauvet <kwizart@...il.com> # PAZ00 T20 and TK1 T124
> >>>>>> Tested-by: Matt Merhar <mattmerhar@...tonmail.com> # Ouya T30
> >>>>>> Signed-off-by: Dmitry Osipenko <digetx@...il.com>
> >>>>>> ---
> >>>>>>  drivers/gpu/drm/tegra/gr2d.c | 155 +++++++++++++++++++++++++++++++++--
> >>>>>
> >>>>> [...]
> >>>>>
> >>>>>>  static int gr2d_remove(struct platform_device *pdev)
> >>>>>> @@ -259,15 +312,101 @@ static int gr2d_remove(struct platform_device *pdev)
> >>>>>>                 return err;
> >>>>>>         }
> >>>>>>
> >>>>>> +       pm_runtime_dont_use_autosuspend(&pdev->dev);
> >>>>>> +       pm_runtime_disable(&pdev->dev);
> >>>>>
> >>>>> There is no guarantee that the ->runtime_suspend() has been invoked
> >>>>> here, which means that clock may be left prepared/enabled beyond this
> >>>>> point.
> >>>>>
> >>>>> I suggest you call pm_runtime_force_suspend(), instead of
> >>>>> pm_runtime_disable(), to make sure that gets done.
> >>>>
> >>>> The pm_runtime_disable() performs the final synchronization, please see [1].
> >>>>
> >>>> [1]
> >>>> https://elixir.bootlin.com/linux/v5.15-rc3/source/drivers/base/power/runtime.c#L1412
> >>>
> >>> pm_runtime_disable() end up calling _pm_runtime_barrier(), which calls
> >>> cancel_work_sync() if dev->power.request_pending has been set.
> >>>
> >>> If the work that was punted to the pm_wq in rpm_idle() has not been
> >>> started yet, we end up just canceling it. In other words, there are no
> >>> guarantees it runs to completion.
> >>
> >> You're right. Although, in a case of this particular patch, the syncing
> >> is actually implicitly done by pm_runtime_dont_use_autosuspend().
> >>
> >> But for drivers which don't use auto-suspend, there is no sync. This
> >> looks like a disaster, it's a very common pattern for drivers to
> >> 'put+disable'.
> >>
> >>> Moreover, use space may have bumped the usage count via sysfs for the
> >>> device (pm_runtime_forbid()) to keep the device runtime resumed.
> >>
> >> Right, this is also a disaster in a case of driver removal.
> >>
> >>>> Calling pm_runtime_force_suspend() isn't correct because each 'enable'
> >>>> must have the corresponding 'disable'. Hence there is no problem here.
> >>>
> >>> pm_runtime_force_suspend() calls pm_runtime_disable(), so I think that
> >>> should be fine. No?
> >>
> >> [adding Rafael]
> >>
> >> Rafael, could you please explain how drivers are supposed to properly
> >> suspend and disable RPM to cut off power and reset state that was
> >> altered by the driver's resume callback? What we're missing? Is Ulf's
> >> suggestion acceptable?
> >>
> >> The RPM state of a device is getting reset on driver's removal, hence
> >> all refcounts that were bumped by the rpm-resume callback of the device
> >> driver will be screwed up if device is kept resumed after removal. I
> >> just verified that it's true in practice.
> >
> > Note that, what makes the Tegra drivers a bit special is that they are
> > always built with CONFIG_PM being set (selected from the "SoC"
> > Kconfig).
> >
> > Therefore, pm_runtime_force_suspend() can work for some of these
> > cases. Using this, would potentially avoid the driver from having to
> > runtime resume the device in ->remove(), according to the below
> > generic sequence, which is used in many drivers.
> >
> > pm_runtime_get_sync()
> > clk_disable_unprepare() (+ additional things to turn off the device)
> > pm_runtime_disable()
> > pm_runtime_put_noidle()
>
> It's not a problem to change this patchset. The problem is that if
> you'll grep mainline for 'pm_runtime_disable', you will find that there
> are a lot of drivers in a potential trouble.

Let's start by fixing this patchset, please - then we can consider
what to do with the other cases separately.

>
> I'm proposing that we should change pm_runtime_disable() to perform the
> syncing with this oneliner:
>
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index ec94049442b9..5c9f28165824 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -1380,6 +1380,8 @@ EXPORT_SYMBOL_GPL(pm_runtime_barrier);
>   */
>  void __pm_runtime_disable(struct device *dev, bool check_resume)
>  {
> +       flush_work(&dev->power.work);
> +

What about the latency this may introduce? I am not sure that is
acceptable here!?

>         spin_lock_irq(&dev->power.lock);
>
>         if (dev->power.disable_depth > 0) {
>
> Objections?
>
> The sysfs rpm-forbid is a separate problem and it's less troublesome
> since it requires root privileges. It's also not something that
> userspace touches casually. For now I don't know what could be done
> about it.

As I said, the common method to address this problem is to run the
following sequence:

pm_runtime_get_sync()
"power off the device"
pm_runtime_disable()
pm_runtime_put_noidle()

This works even if user space, via sysfs, has triggered a call to
pm_runtime_forbid(). Or doesn't it?

If you don't like it, pm_runtime_force_suspend() should work too, at
least for your cases, I believe.

Kind regards
Uffe

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ