Message-ID: <aKg5j7hkxI2q1x0s@smile.fi.intel.com>
Date: Fri, 22 Aug 2025 12:34:07 +0300
From: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
To: Jisheng Zhang <jszhang@...nel.org>
Cc: Jarkko Nikula <jarkko.nikula@...ux.intel.com>,
	Mika Westerberg <mika.westerberg@...ux.intel.com>,
	Jan Dabros <jsd@...ihalf.com>, Andi Shyti <andi.shyti@...nel.org>,
	linux-kernel@...r.kernel.org, linux-i2c@...r.kernel.org
Subject: Re: [PATCH 1/2] i2c: designware: Avoid taking clk_prepare mutex in
 PM callbacks

On Fri, Aug 22, 2025 at 12:18:43PM +0300, Andy Shevchenko wrote:
> On Fri, Aug 22, 2025 at 12:32:57AM +0800, Jisheng Zhang wrote:
> > On Thu, Aug 21, 2025 at 04:01:55PM +0300, Andy Shevchenko wrote:
> > > On Thu, Aug 21, 2025 at 03:45:43PM +0300, Jarkko Nikula wrote:
> > > > On 8/20/25 7:33 PM, Jisheng Zhang wrote:
> > > > > On Wed, Aug 20, 2025 at 07:05:42PM +0300, Andy Shevchenko wrote:
> > > > > > On Wed, Aug 20, 2025 at 11:31:24PM +0800, Jisheng Zhang wrote:
> > > > > > > This is unsafe, as the runtime PM callbacks are called from the PM
> > > > > > > workqueue, so this may deadlock when handling an i2c attached clock,
> > > > > > > which may already hold the clk_prepare mutex from another context.
> > > > > > 
> > > > > > Can you be more specific? What is the actual issue in practice?
> > > > > > Do you have traces and lockdep warnings?
> > > > > 
> > > > > > Assume we use i2c designware to control an i2c based clk, e.g. the
> > > > > > clk-si5351.c driver. In its .clk_prepare, we hold the prepare_lock
> > > > > > mutex, then we call the i2c adapter to operate the regs. To runtime
> > > > > > resume the i2c adapter, we call clk_prepare_enable(), which will try
> > > > > > to get the prepare_lock mutex again.
> > > > > 
> > > > I'd also like to see the issue here. I fail to see the relation
> > > > between the clocks managed by clk-si5351.c and the clocks of the
> > > > i2c-designware IP.
> > 
> > The key here is: all clks in the system share the same prepare_lock
> > mutex. The global prepare_lock mutex is locked by the clk-si5351
> > .prepare(); then, in this exact .prepare(), the i2c-designware runtime
> > resume will try to lock the same prepare_lock again due to
> > clk_prepare_enable().
> > Could you please check clk_prepare_lock() in drivers/clk/clk.c?
> > 
> > And if we take a look at other i2c adapter drivers, we'll see that
> > some of them have already hit this issue and fixed it, such
> > as
> > 
> > i2c-exynos5, by commit 10ff4c5239a1 ("i2c: exynos5: Fix possible ABBA
> > deadlock by keeping I2C clock prepared")
> > 
> > i2c-imx, by commit d9a22d713acb ("i2c: imx: avoid taking clk_prepare
> > mutex in PM callbacks")
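
The lock dependency described above can be sketched in userspace. This is only
a rough pthread model and not kernel code: prepare_lock here is an ordinary
mutex standing in for the global one in drivers/clk/clk.c, and resume_worker()
and resume_would_block() are hypothetical names. As far as I understand,
clk_prepare_lock() lets the task that already owns the lock re-enter it, so the
sketch models the case from the patch description where the runtime resume runs
in a different context (the PM workqueue). It uses trylock so that it reports
the conflict instead of hanging.

```c
/* Userspace sketch only; all names below are hypothetical stand-ins. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Stand-in for the single, system-wide prepare_lock of the clk framework. */
static pthread_mutex_t prepare_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Stand-in for the adapter's runtime resume running in another context.
 * A real resume would sleep waiting for the mutex; trylock is used here
 * so the sketch can report the conflict instead of deadlocking.
 */
static void *resume_worker(void *arg)
{
	int *would_block = arg;
	int ret = pthread_mutex_trylock(&prepare_lock);

	if (ret == 0) {
		/* Lock was free: "prepare" the adapter clock and release. */
		pthread_mutex_unlock(&prepare_lock);
		*would_block = 0;
	} else {
		*would_block = (ret == EBUSY);
	}
	return NULL;
}

/*
 * Stand-in for the .prepare() of an I2C-attached clock: optionally hold
 * the global lock, then wait for the adapter to be resumed elsewhere.
 * Returns 1 if the resume side would have blocked on prepare_lock.
 */
static int resume_would_block(int hold_prepare_lock)
{
	pthread_t worker;
	int would_block = -1;
	int ret;

	if (hold_prepare_lock)
		pthread_mutex_lock(&prepare_lock);

	ret = pthread_create(&worker, NULL, resume_worker, &would_block);
	assert(ret == 0);
	pthread_join(worker, NULL);

	if (hold_prepare_lock)
		pthread_mutex_unlock(&prepare_lock);

	return would_block;
}
```

Calling resume_would_block(1) models the clock driver holding prepare_lock
while it waits for the adapter; resume_would_block(0) models the approach of
the referenced commits, where the resume path does not need the lock while
someone else holds it.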

> Why is this an I²C driver problem?

I just read these two and one more referenced from one of the changes.

I do not think this is the correct fix. It seems to me like papering over a
special (corner) case. I would agree to this change if and only if the CLK
maintainers tell us that there is no other way.

My understanding is that the I²C clock and the client's clocks (when it's a
clock provider) are independent. There should not be such a clash to begin
with. The clock framework should operate on clock subtrees and not have yet
another Global Kernel Lock.

That said, I think this is a design issue in the CLK framework; we should not
go and "fix" all the drivers. Today it's I²C, tomorrow SPI and I³C and so on...
This is not a scalable solution.

Here is a formal NAK until this is worked out with the CLK maintainers and
there is an agreed roadmap for this issue (or these issues).

> > > I believe they try to make an example when clk-si5351 is the provider of
> > > the clock to I²C host controller (DesignWare).
> > 
> > Nope, the example case is using i2c host controller to operate the clk-si5351
> 
> Okay, so that chip is controlled over I²C, but how are its clocks even
> related to the I²C host controller clock?! I am sorry, I am lost here.
> 
> > > But I'm still not sure about the issues here... Without (even simulated with
> > > specific delay injections) lockdep warnings it would be rather theoretical.
> > 
> > No, it happened in the real world.
> 
> Can you provide the requested traces and lockdep warnings and/or other stuff
> to see what's going on there?

-- 
With Best Regards,
Andy Shevchenko


