lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20250326-cross-lock-dep-v1-0-3199e49e8652@bootlin.com>
Date: Wed, 26 Mar 2025 19:26:15 +0100
From: Miquel Raynal <miquel.raynal@...tlin.com>
To: "Rafael J. Wysocki" <rafael@...nel.org>, Pavel Machek <pavel@....cz>, 
 Len Brown <len.brown@...el.com>, 
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>, 
 Danilo Krummrich <dakr@...nel.org>, 
 Michael Turquette <mturquette@...libre.com>, 
 Stephen Boyd <sboyd@...nel.org>
Cc: Thomas Petazzoni <thomas.petazzoni@...tlin.com>, 
 linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org, 
 linux-clk@...r.kernel.org, Chen-Yu Tsai <wenst@...omium.org>, 
 Lucas Stach <l.stach@...gutronix.de>, 
 Laurent Pinchart <laurent.pinchart@...asonboard.com>, 
 Marek Vasut <marex@...x.de>, Ulf Hansson <ulf.hansson@...aro.org>, 
 Kevin Hilman <khilman@...nel.org>, Fabio Estevam <festevam@...x.de>, 
 Jacky Bai <ping.bai@....com>, Peng Fan <peng.fan@....com>, 
 Shawn Guo <shawnguo@...nel.org>, Shengjiu Wang <shengjiu.wang@....com>, 
 linux-imx@....com, Ian Ray <ian.ray@...ealthcare.com>, 
 Hervé Codina <herve.codina@...tlin.com>, 
 Luca Ceresoli <luca.ceresoli@...tlin.com>, 
 Saravana Kannan <saravanak@...gle.com>, 
 Miquel Raynal <miquel.raynal@...tlin.com>
Subject: [PATCH RFC 00/10] Fix the ABBA locking situation between clk and
 runtime PM

As explained in the following thread, there is a known ABBA locking
dependency between clk and runtime PM.
Link: https://lore.kernel.org/linux-clk/20240527181928.4fc6b5f0@xps-13/

The problem is that the clk subsystem uses a mutex to protect concurrent
accesses to its tree structure, and so do other subsystems such as
generic power domains. While it holds its own mutex, the clk subsystem
performs runtime PM calls which end up executing callbacks from other
subsystems (again, gen PD is in the loop). But typically power domains
may also need to perform clock related operations, and thus the
following two situations may happen:

mutex_lock(clk);
mutex_lock(genpd);

or

mutex_lock(genpd);
mutex_lock(clk);

As of today I know that at least NXP i.MX8MP and MediaTek MT8183 SoCs
are complex enough to face this kind of issues.

There's been a first workaround to "silence" lockdep with the most
obvious case triggering the warning: making sure all clocks are RPM
enabled before running the clk_disable_unused() work, but this is just
addressing one situation among many other potentially problematic
situations. In the past, both Laurent Pinchart and Marek Vasut have
experienced these issues when enabling HDMI and audio support,
respectively.

Following a discussion we had at last Plumbers with Steven, I am
proposing to decouple both locks by changing a bit the clk approach:
let's always runtime resume all clocks that we *might* need before
taking the clock lock. But how do we know the list? Well, depending on
the situation we may either need to wake up:
- the upper part of the tree during prepare/unprepare operations.
- the lower part of the tree during (read) rate operations.
- the upper part and the lower part of the tree otherwise (especially
  during rate changes which may involve reparenting).

Luckily, we do not need to do that by hand, are more importantly we do
not need to use the clock tree for that because thanks to the work from
Saravana, we already have device links describing exhaustively the
consumer/supplier relationships. The clock topology (from a runtime PM
perspective) is reflected in these links. In practice, we do not care
about all consumers, but the few clock operations that will actually
trigger runtime PM operations are probably not impacting enough to
justify something more complex.

So here it is: every patch in this series decouples the two locks in
various places of the clock subsystem, until we reach a point where all
needed clocks will always be resumed before acquiring the core lock. It
obviously requires a few new helpers in the RPM core which may probably
be enhanced, I've tried to keep them as simple and straightforward as
possible.

Signed-off-by: Miquel Raynal <miquel.raynal@...tlin.com>
---
Miquel Raynal (10):
      PM: runtime: Add helpers to resume consumers
      clk: Improve comments with usual punctuation
      clk: Avoid non needed runtime PM calls
      clk: Avoid open coded-logic when a there is a helper available
      clk: Move runtime PM calls out of the prepare_lock in clk_init()
      clk: Move runtime PM calls out of the prepare_lock in clk_prepare()
      clk: Ensure all RPM enabled clocks are enabled before reparenting orphans
      clk: Move runtime PM calls out of the prepare_lock in clk_unregister()
      clk: Make sure clock parents and children are resumed when necessary
      clk: Fix the ABBA locking issue with runtime PM subcalls

 drivers/base/power/runtime.c |  54 ++++++++++++
 drivers/clk/clk.c            | 204 +++++++++++++++++++++++++++++++------------
 include/linux/pm_runtime.h   |   2 +
 3 files changed, 204 insertions(+), 56 deletions(-)
---
base-commit: ab6df33805e6e6e4ac1a519cfcade3f7f19f6ff1
change-id: 20250307-cross-lock-dep-e65f285ce8e1

Best regards,
-- 
Miquel Raynal <miquel.raynal@...tlin.com>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ