Message-ID:
 <VI2PR04MB11147CCEFE4204B852807AAF2E841A@VI2PR04MB11147.eurprd04.prod.outlook.com>
Date: Tue, 1 Jul 2025 03:16:08 +0000
From: Carlos Song <carlos.song@....com>
To: "mturquette@...libre.com" <mturquette@...libre.com>, "sboyd@...nel.org"
	<sboyd@...nel.org>, "rafael@...nel.org" <rafael@...nel.org>,
	"pavel@...nel.org" <pavel@...nel.org>, "len.brown@...el.com"
	<len.brown@...el.com>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	"dakr@...nel.org" <dakr@...nel.org>, Aisheng Dong <aisheng.dong@....com>,
	Andi Shyti <andi.shyti@...nel.org>, "shawnguo@...nel.org"
	<shawnguo@...nel.org>, "s.hauer@...gutronix.de" <s.hauer@...gutronix.de>,
	"kernel@...gutronix.de" <kernel@...gutronix.de>, "festevam@...il.com"
	<festevam@...il.com>, Frank Li <frank.li@....com>
CC: "linux-clk@...r.kernel.org" <linux-clk@...r.kernel.org>,
	"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
	"linux-arm-kernel@...ts.infradead.org"
	<linux-arm-kernel@...ts.infradead.org>, "linux-i2c@...r.kernel.org"
	<linux-i2c@...r.kernel.org>, "imx@...ts.linux.dev" <imx@...ts.linux.dev>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Bough Chen
	<haibo.chen@....com>, Jun Li <jun.li@....com>
Subject: Dead lock with clock global prepare_lock mutex and device's
 power.runtime_status

Hi, All:

We recently hit a deadlock that we believe is a common problem, and we are not sure how to fix it.

We use the gpio-gate-clock clock provider (drivers/clk/clk-gpio.c), and its GPIO comes from an I2C GPIO expander (drivers/gpio/gpio-pcf857x.c). Our I2C driver (drivers/i2c/busses/i2c-imx-lpi2c.c [1]) has runtime PM enabled. The system randomly hangs at reboot.

The deadlock happens with the call stacks below:

Task 117                                                Task 120

schedule()
clk_prepare_lock()--> wait prepare_lock(mutex_lock)     schedule() wait for power.runtime_status exit RPM_SUSPENDING
                           ^^^^ A                       ^^^^ B
clk_bulk_unprepare()                                    rpm_resume()
lpi2c_runtime_suspend()                                 pm_runtime_resume_and_get()
...                                                     lpi2c_imx_xfer()
                                                        ...
rpm_suspend() set RPM_SUSPENDING                        pcf857x_set();
                           ^^^^ B                       ...
                                                        clk_prepare_lock() --> hold prepare_lock
                                                        ^^^^ A
                                                        ...


Task 117 sets power.runtime_status to RPM_SUSPENDING (B) and then waits for task 120 to release the clock framework's global prepare_lock mutex (A).

Task 120 holds the global prepare_lock mutex (A) and waits for power.runtime_status to leave RPM_SUSPENDING (B).

The root cause is that the scope of the global prepare_lock is too broad: the gpio-gate-clock and the lpi2c clock are completely independent, yet both are serialized by this one mutex.
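
To make the cycle concrete, here is a minimal userspace sketch in Python. It is not kernel code. It models the two tasks as a wait-for graph and treats "the device is in RPM_SUSPENDING" as a resource owned by task 117, which is a modelling assumption. The helper find_cycle and the per-clock lock names gpio_gate_lock and lpi2c_clk_lock are hypothetical; they only illustrate that the cycle exists because both clocks share one lock, and are not a proposed design.

```python
# Userspace model only, not kernel code. Task and resource names mirror the
# call stacks above; find_cycle and the per-clock lock names are hypothetical.

def find_cycle(holds, waits):
    """Return the tasks forming a wait-for cycle, or [] if there is none.

    holds: dict mapping task -> set of resources the task currently owns
    waits: dict mapping task -> the single resource the task is blocked on
    """
    # Map each resource to the task that owns it.
    owner = {res: task for task, owned in holds.items() for res in owned}
    for start in waits:
        seen = []
        task = start
        # Follow "waits for a resource owned by" edges until they end or repeat.
        while task in waits and task not in seen:
            seen.append(task)
            task = owner.get(waits[task])
        if task is not None and task in seen:
            return seen[seen.index(task):]
    return []


# Situation in this report: one global prepare_lock for every clock.
holds_global = {
    "T117": {"lpi2c RPM_SUSPENDING"},   # rpm_suspend() set the status (B)
    "T120": {"prepare_lock"},           # clk_prepare() took the mutex (A)
}
waits_global = {
    "T117": "prepare_lock",             # clk_bulk_unprepare() blocks on A
    "T120": "lpi2c RPM_SUSPENDING",     # rpm_resume() blocks on B
}

# Hypothetical situation: the two independent clocks do not share a lock.
holds_split = {
    "T117": {"lpi2c RPM_SUSPENDING"},
    "T120": {"gpio_gate_lock"},
}
waits_split = {
    "T117": "lpi2c_clk_lock",           # nobody owns it, so T117 can proceed
    "T120": "lpi2c RPM_SUSPENDING",
}

cycle_global = find_cycle(holds_global, waits_global)
cycle_split = find_cycle(holds_split, waits_split)
```

In this model cycle_global is ["T117", "T120"], which is the deadlock, while cycle_split is empty: task 117 would finish suspending, after which task 120 could resume the device and complete its transfer.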

Although this may not happen with the upstream lpi2c driver because of [1], there are still other I2C bus drivers, SPI bus drivers, and other bus drivers. Unpreparing clocks in runtime suspend callbacks is quite common.

[1] The upstream driver does not call clk_unprepare in its suspend functions.

The full log is below:

INFO: task kworker/2:3:117(T117) is blocked on a mutex likely owned by task kworker/u16:5:120(T120).

[    6.955479][   T73] imx-lpi2c 42530000.i2c: lpi2c_runtime_suspend2
[    6.957437][  T120] imx6q-pcie 4c300000.pcie: config reg[1] 0x60100000 == cpu 0x60100000
[    6.957437][  T120] ; no fixup was ever needed for this devicetree
[    6.964257][  T118] platform regulatory.0: Falling back to sysfs fallback for: regulatory.db
[    6.973579][  T120] imx-lpi2c 42530000.i2c: lpi2c_runtime_resume1
[    7.027143][  T120] imx-lpi2c 42530000.i2c: lpi2c_runtime_resume2 0
[    7.033984][  T120] -----------pcf857x_set in
[    7.038373][  T120] -----------------pcf857x_output in
[    7.043527][  T120] ----------------- gpio->write in
[    7.048520][  T117] imx-lpi2c 42530000.i2c: lpi2c_runtime_suspend
[    7.054774][  T120] i2c i2c-2: msg[0] w0/r1 0, data[0] is 7f
[    7.060448][  T120] i2c i2c-2: 42530000.i2c: pm_runtime_resume_and_get
[   67.030316][  T118] cfg80211: failed to load regulatory.db
[  244.059129][   T40] INFO: task kworker/2:3:117 blocked for more than 121 seconds.
[  244.066619][   T40]       Not tainted 6.15.0-rc2-next-20250417-06621-g7cd761409c73-dirty #7
[  244.075010][   T40] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  244.083572][   T40] task:kworker/2:3     state:D stack:0     pid:117   tgid:117   ppid:2      task_flags:0x4208060 flags:0x00000008
[  244.095438][   T40] Workqueue: pm pm_runtime_work
[  244.100157][   T40] Call trace:
[  244.103302][   T40]  __switch_to+0xf8/0x1a0 (T)
[  244.107882][   T40]  __schedule+0x418/0xfd8
[  244.112080][   T40]  schedule+0x4c/0x164
[  244.116055][   T40]  schedule_preempt_disabled+0x24/0x40
[  244.121392][   T40]  __mutex_lock+0x1d4/0x580
[  244.125798][   T40]  mutex_lock_nested+0x24/0x30
[  244.130436][   T40]  clk_prepare_lock+0x4c/0xa8
[  244.135018][   T40]  clk_unprepare+0x24/0x44
[  244.139298][   T40]  clk_bulk_unprepare+0x38/0x60
[  244.144048][   T40]  lpi2c_runtime_suspend+0x64/0x9c
[  244.149021][   T40]  pm_generic_runtime_suspend+0x2c/0x44
[  244.154475][   T40]  __rpm_callback+0x48/0x1ec
[  244.158935][   T40]  rpm_callback+0x74/0x80
[  244.163167][   T40]  rpm_suspend+0x104/0x668
[  244.167446][   T40]  pm_runtime_work+0xc8/0xcc
[  244.171939][   T40]  process_one_work+0x214/0x62c
[  244.176650][   T40]  worker_thread+0x1ac/0x34c
[  244.181144][   T40]  kthread+0x144/0x220
[  244.185082][   T40]  ret_from_fork+0x10/0x20
[  244.189435][   T40] INFO: task kworker/2:3:117 is blocked on a mutex likely owned by task kworker/u16:5:120.
[  244.199300][   T40] task:kworker/u16:5   state:D stack:0     pid:120   tgid:120   ppid:2      task_flags:0x4208060 flags:0x00000008
[  244.211164][   T40] Workqueue: async async_run_entry_fn
[  244.216404][   T40] Call trace:
[  244.219587][   T40]  __switch_to+0xf8/0x1a0 (T)
[  244.224127][   T40]  __schedule+0x418/0xfd8
[  244.228358][   T40]  schedule+0x4c/0x164
[  244.232298][   T40]  rpm_resume+0x1c8/0x734
[  244.236531][   T40]  __pm_runtime_resume+0x50/0x98
[  244.241338][   T40]  lpi2c_imx_xfer+0x58/0xe60
[  244.245829][   T40]  __i2c_transfer+0x1c4/0x828
[  244.250377][   T40]  i2c_smbus_xfer_emulated+0x1b8/0x708
[  244.255735][   T40]  __i2c_smbus_xfer+0x1a0/0x6f0
[  244.260447][   T40]  i2c_smbus_xfer+0x98/0x120
[  244.264939][   T40]  i2c_smbus_write_byte+0x2c/0x3c
[  244.269825][   T40]  i2c_write_le8+0x10/0x20
[  244.274152][   T40]  pcf857x_output+0x7c/0xc0
[  244.278527][   T40]  pcf857x_set+0x3c/0x5c
[  244.282672][   T40]  gpiochip_set+0x68/0xc0
[  244.286864][   T40]  gpiod_set_raw_value_commit+0xd4/0x1a0
[  244.292404][   T40]  gpiod_set_value_nocheck+0x34/0x60
[  244.297549][   T40]  gpiod_set_value_cansleep+0x24/0x60
[  244.302821][   T40]  clk_sleeping_gpio_gate_prepare+0x18/0x28
[  244.308582][   T40]  clk_core_prepare+0xbc/0x2a8
[  244.313247][   T40]  clk_prepare+0x28/0x44
[  244.317361][   T40]  clk_bulk_prepare+0x34/0xa0
[  244.321940][   T40]  imx_pcie_host_init+0xe0/0x434
[  244.326747][   T40]  dw_pcie_host_init+0x1b8/0x758
[  244.331587][   T40]  imx_pcie_probe+0x380/0x8e0
[  244.336133][   T40]  platform_probe+0x68/0xd8
[  244.340539][   T40]  really_probe+0xbc/0x2bc
[  244.344817][   T40]  __driver_probe_device+0x78/0x120
[  244.349916][   T40]  driver_probe_device+0x3c/0x160
[  244.354801][   T40]  __device_attach_driver+0xb8/0x140
[  244.359986][   T40]  bus_for_each_drv+0x88/0xe8
[  244.364534][   T40]  __device_attach_async_helper+0xb8/0xdc
[  244.370161][   T40]  async_run_entry_fn+0x34/0xe0
[  244.374882][   T40]  process_one_work+0x214/0x62c
[  244.379634][   T40]  worker_thread+0x1ac/0x34c
[  244.384102][   T40]  kthread+0x144/0x220
[  244.388076][   T40]  ret_from_fork+0x10/0x20

Best Regards
Carlos Song

