lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <8dfe4bfff1256c1ceffeab81cd587d0d@kernel.org>
Date: Mon, 14 Apr 2025 18:00:15 -0700
From: Stephen Boyd <sboyd@...nel.org>
To: Danilo Krummrich <dakr@...nel.org>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Len Brown <len.brown@...el.com>, Michael Turquette <mturquette@...libre.com>, Miquel Raynal <miquel.raynal@...tlin.com>, Pavel Machek <pavel@....cz>, Rafael J. Wysocki <rafael@...nel.org>
Cc: Thomas Petazzoni <thomas.petazzoni@...tlin.com>, linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org, linux-clk@...r.kernel.org, Chen-Yu Tsai <wenst@...omium.org>, Lucas Stach <l.stach@...gutronix.de>, Laurent Pinchart <laurent.pinchart@...asonboard.com>, Marek Vasut <marex@...x.de>, Ulf Hansson <ulf.hansson@...aro.org>, Kevin Hilman <khilman@...nel.org>, Fabio Estevam <festevam@...x.de>, Jacky Bai <ping.bai@....com>, Peng Fan <peng.fan@....com>, Shawn Guo <shawnguo@...nel.org>, Shengjiu Wang <shengjiu.wang@....com>, linux-imx@....com, Ian Ray <ian.ray@...ealthcare.com>, Hervé Codina <herve.codina@...tlin.com>, Luca Ceresoli <luca.ceresoli@...tlin.com>, Saravana Kannan <saravanak@...gle.com>, Miquel Raynal <miquel.raynal@...tlin.com>
Subject: Re: [PATCH RFC 00/10] Fix the ABBA locking situation between clk and runtime PM

Quoting Miquel Raynal (2025-03-26 11:26:15)
> As explained in the following thread, there is a known ABBA locking
> dependency between clk and runtime PM.
> Link: https://lore.kernel.org/linux-clk/20240527181928.4fc6b5f0@xps-13/
> 
> The problem is that the clk subsystem uses a mutex to protect concurrent
> accesses to its tree structure, and so do other subsystems such as
> generic power domains. While it holds its own mutex, the clk subsystem
> performs runtime PM calls which end up executing callbacks from other
> subsystems (again, gen PD is in the loop). But typically power domains
> may also need to perform clock related operations, and thus the
> following two situations may happen:
> 
> mutex_lock(clk);
> mutex_lock(genpd);
> 
> or
> 
> mutex_lock(genpd);
> mutex_lock(clk);
> 
> As of today I know that at least NXP i.MX8MP and MediaTek MT8183 SoCs
> are complex enough to face this kind of issues.
> 
> There's been a first workaround to "silence" lockdep with the most
> obvious case triggering the warning: making sure all clocks are RPM
> enabled before running the clk_disable_unused() work, but this is just
> addressing one situation among many other potentially problematic
> situations. In the past, both Laurent Pinchart and Marek Vasut have
> experienced these issues when enabling HDMI and audio support,
> respectively.
> 
> Following a discussion we had at last Plumbers with Steven, I am
> proposing to decouple both locks by changing a bit the clk approach:
> let's always runtime resume all clocks that we *might* need before
> taking the clock lock. But how do we know the list? Well, depending on
> the situation we may either need to wake up:
> - the upper part of the tree during prepare/unprepare operations.
> - the lower part of the tree during (read) rate operations.
> - the upper part and the lower part of the tree otherwise (especially
>   during rate changes which may involve reparenting).

Thanks for taking on this work. This problem is coming up more and more
often.

> 
> Luckily, we do not need to do that by hand, are more importantly we do
> not need to use the clock tree for that because thanks to the work from
> Saravana, we already have device links describing exhaustively the
> consumer/supplier relationships. The clock topology (from a runtime PM
> perspective) is reflected in these links. In practice, we do not care
> about all consumers, but the few clock operations that will actually
> trigger runtime PM operations are probably not impacting enough to
> justify something more complex.

This won't always work, for a couple reasons. First because clk drivers
aren't required to describe their parent clks that are outside the clk
controller by using DT with a 'clocks' property in the clk controller
node. Second because there can be a many to one relationship between a
struct device and struct device_node. We're trying to push drivers to be
written in a way that the binding has the 'clocks' property, but that
isn't always the case, so we still need a solution that works in all
cases so as to not regress old (legacy?) implementations or ones that
divide a platform device into some number of auxiliary devices and
drivers.

One idea to do that would be to implement the device links between clk
controller devices based on all possible parents of the clk. We support
lazily registering clks though, meaning a parent could be registered at
any time, so we would have to explore the clk tree each time a clk is
registered to see if any new clks need to be found and device links made
between devices. The general algorithm is probably something like:

  clk_register()
   make_links_for_node()
    if device node missing 'clocks' property
     for each parent string
      pclk = find parent clk
      pdev = pclk->dev
      link pdev to dev node
    else
     for each clk in clocks property
      pclk = find parent clk
      pdev = pclk->dev
      link pdev to dev node

We have to get the parent clk in all cases because we don't know which
device it may be registered for (the platform device or auxiliary
device). If we optimize here, I'd prefer we optimize for the case where
the 'clocks' property is present to encourage migration. Maybe what we
can do is make some structure for a clk controller and have a linked
list of those that we look up when a new clk is registered. We actually
have that already with 'struct clock_provider' so maybe we need to
extend that.

Stash the device pointer in there and some variable sized array of the
clk_core pointers to the external clks. In the 'clocks' DT property
case, we can size this immediately and map the array index to the
property but in the non-property case we'll have to grow this array each
time a new clk is found to be a parent of the device. Maybe for that we
should just have some other sort of linked list of clk_core pointers
that we continue to stack clks onto.

  struct clock_provider {
    void (*clk_init_cb)(struct device_node *);
    struct device *dev;
    struct device_node *np;
    struct list_head node;
    struct clk_core *legacy_clks; // Or struct list_head legacy?
    size_t len_clocks;
    struct clk_core clocks_property[];
 };

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ