[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YImEMN/POW5C8lG7@hovoldconsulting.com>
Date: Wed, 28 Apr 2021 17:50:08 +0200
From: Johan Hovold <johan@...nel.org>
To: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
Cc: Shawn Tu <shawnx.tu@...el.com>,
Ricardo Ribalda <ribalda@...nel.org>,
Dafna Hirschfeld <dafna.hirschfeld@...labora.com>,
Heiko Stuebner <heiko@...ech.de>, linuxarm@...wei.com,
Todor Tomov <todor.too@...il.com>,
Bjorn Andersson <bjorn.andersson@...aro.org>,
Andrzej Hajda <a.hajda@...sung.com>,
"Lad, Prabhakar" <prabhakar.csengg@...il.com>,
Thierry Reding <thierry.reding@...il.com>,
Pengutronix Kernel Team <kernel@...gutronix.de>,
Dmitry Osipenko <digetx@...il.com>,
linux-stm32@...md-mailman.stormreply.com,
Andrzej Pietrasiewicz <andrzejtp2010@...il.com>,
Leon Luo <leonl@...pardimaging.com>,
Paul Kocialkowski <paul.kocialkowski@...tlin.com>,
Mauro Carvalho Chehab <mchehab@...nel.org>,
Dave Stevenson <dave.stevenson@...pberrypi.com>,
Matt Ranostay <matt.ranostay@...sulko.com>,
Krzysztof Kozlowski <krzk@...nel.org>,
Jonathan Hunter <jonathanh@...dia.com>,
linux-rockchip@...ts.infradead.org, Chen-Yu Tsai <wens@...e.org>,
Andy Gross <agross@...nel.org>,
Matthias Brugger <matthias.bgg@...il.com>,
Dongchun Zhu <dongchun.zhu@...iatek.com>,
Sakari Ailus <sakari.ailus@...ux.intel.com>,
Bingbu Cao <bingbu.cao@...el.com>,
Marek Szyprowski <m.szyprowski@...sung.com>,
Shunqian Zheng <zhengsq@...k-chips.com>,
Tianshu Qiu <tian.shu.qiu@...el.com>,
NXP Linux Team <linux-imx@....com>,
Philipp Zabel <p.zabel@...gutronix.de>,
devel@...verdev.osuosl.org, Jacopo Mondi <jacopo@...ndi.org>,
Sylwester Nawrocki <s.nawrocki@...sung.com>,
linux-tegra@...r.kernel.org,
Alexandre Torgue <alexandre.torgue@...com>,
Wenyou Yang <wenyou.yang@...rochip.com>,
Manivannan Sadhasivam <mani@...nel.org>,
linux-arm-msm@...r.kernel.org,
Sascha Hauer <s.hauer@...gutronix.de>,
Steve Longerbeam <slongerbeam@...il.com>,
linux-media@...r.kernel.org, Maxime Ripard <mripard@...nel.org>,
Stanimir Varbanov <stanimir.varbanov@...aro.org>,
Benoit Parrot <bparrot@...com>,
Helen Koike <helen.koike@...labora.com>,
linux-samsung-soc@...r.kernel.org,
linux-mediatek@...ts.infradead.org,
Jacek Anaszewski <jacek.anaszewski@...il.com>,
mauro.chehab@...wei.com,
Sylwester Nawrocki <sylvester.nawrocki@...il.com>,
"Paul J. Murphy" <paul.j.murphy@...el.com>,
Ezequiel Garcia <ezequiel@...labora.com>,
Daniele Alessandrelli <daniele.alessandrelli@...el.com>,
Chiranjeevi Rapolu <chiranjeevi.rapolu@...el.com>,
linux-arm-kernel@...ts.infradead.org,
Jacob Chen <jacob-chen@...wrt.com>,
Jernej Skrabec <jernej.skrabec@...l.net>,
Hyungwoo Yang <hyungwoo.yang@...el.com>,
linux-kernel@...r.kernel.org, Robert Foss <robert.foss@...aro.org>,
Dan Scally <djrscally@...il.com>,
Sowjanya Komatineni <skomatineni@...dia.com>,
Maxime Coquelin <mcoquelin.stm32@...il.com>,
linux-renesas-soc@...r.kernel.org, Yong Zhi <yong.zhi@...el.com>,
Shawn Guo <shawnguo@...nel.org>
Subject: Re: [PATCH v4 00/79] Address some issues with PM runtime at media
subsystem
On Wed, Apr 28, 2021 at 04:51:21PM +0200, Mauro Carvalho Chehab wrote:
> During the review of the patches from unm.edu, one of the patterns
> I noticed is the amount of patches trying to fix pm_runtime_get_sync()
> calls.
>
> After analyzing the feedback from version 1 of this series, I noticed
> a few other weird behaviors at the PM runtime resume code. So, this
> series start addressing some bugs and issues at the current code.
> Then, it gets rid of pm_runtime_get_sync() at the media subsystem
> (with 2 exceptions).
>
> It should be noticed that
> Commit dd8088d5a896 ("PM: runtime: Add pm_runtime_resume_and_get to deal with usage counter")
> added a new method to does a pm_runtime get, which increments
> the usage count only on success.
>
> The rationale of getting rid of pm_runtime_get_sync() is:
>
> 1. despite its name, this is actually a PM runtime resume call,
> but some developers didn't seem to realize that, as I got this
> pattern on some drivers:
>
> pm_runtime_get_sync(&client->dev);
> pm_runtime_disable(&client->dev);
> pm_runtime_set_suspended(&client->dev);
> pm_runtime_put_noidle(&client->dev);
>
> It makes no sense to resume PM just to suspend it again ;-)
This is perfectly alright. Take a look at ov7740_remove() for example:
pm_runtime_get_sync(&client->dev);
pm_runtime_disable(&client->dev);
pm_runtime_set_suspended(&client->dev);
pm_runtime_put_noidle(&client->dev);
ov7740_set_power(ov7740, 0);
There's an explicit power-off after balancing the PM count and this will
work regardless of the power state when entering this function.
So this has nothing to do with pm_runtime_get_sync() per se.
> 2. Usual *_get() methods only increment their use count on success,
> but pm_runtime_get_sync() increments it unconditionally. Due to
> that, several drivers were mistakenly not calling
> pm_runtime_put_noidle() when it fails;
Sure, but pm_runtime_get_async() also works this way. You just won't be
notified if the async resume fails.
> 3. The name of the new variant is a lot clearer:
> pm_runtime_resume_and_get()
> As its same clearly says that this is a PM runtime resume function,
> that also increments the usage counter on success;
It also introduced an inconsistency in the API and does not pair as well
with the pm_runtime_put variants.
> 4. Consistency: we did similar changes subsystem wide with
> for instance strlcpy() and strcpy() that got replaced by
> strscpy(). Having all drivers using the same known-to-be-safe
> methods is a good thing;
It's not known to be safe; there are ways to get also this interface
wrong as for example this series has shown.
> 5. Prevent newer drivers to copy-and-paste a code that it would
> be easier to break if they don't truly understand what's behind
> the scenes.
Cargo-cult programming always runs that risk.
> This series replace places pm_runtime_get_sync(), by calling
> pm_runtime_resume_and_get() instead.
>
> This should help to avoid future mistakes like that, as people
> tend to use the existing drivers as examples for newer ones.
The only valid point about and use for pm_runtime_resume_and_get() is to
avoid leaking a PM usage count reference in the unlikely case that
resume fails (something which hardly any driver implements recovery
from anyway).
It's a convenience wrapper that saves you from writing one extra line in
some cases (depending on how you implement runtime-pm support) and not a
silver bullet against bugs.
> compile-tested only.
>
> Patches 1 to 7 fix some issues that already exists at the current
> PM runtime code;
>
> patches 8 to 20 fix some usage_count problems that still exists
> at the media subsystem;
>
> patches 21 to 78 repaces pm_runtime_get_sync() by
> pm_runtime_resume_and_get();
>
> Patch 79 (and a hunk on patch 78) documents the two exceptions
> where pm_runtime_get_sync() will still be used for now.
>
> ---
>
> v4:
> - Added a couple of additional fixes at existing PM runtime code;
> - Some patches are now more conservative in order to avoid causing
> regressions.
> v3:
> - fix a compilation error;
> v2:
> - addressed pointed issues and fixed a few other PM issues.
This really doesn't say much more than "changed stuff" so kinda hard to
track if review feedback has been taken into account for example.
Johan
Powered by blists - more mailing lists