[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250826060647.7eerg5blx3xho27p@vireshk-i7>
Date: Tue, 26 Aug 2025 11:36:47 +0530
From: Viresh Kumar <viresh.kumar@...aro.org>
To: Marek Szyprowski <m.szyprowski@...sung.com>
Cc: Krishna Chaitanya Chundru <krishna.chundru@....qualcomm.com>,
Viresh Kumar <vireshk@...nel.org>, Nishanth Menon <nm@...com>,
Stephen Boyd <sboyd@...nel.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Manivannan Sadhasivam <mani@...nel.org>,
Lorenzo Pieralisi <lpieralisi@...nel.org>,
Krzysztof Wilczyński <kwilczynski@...nel.org>,
Rob Herring <robh@...nel.org>, Bjorn Helgaas <bhelgaas@...gle.com>,
Bjorn Andersson <andersson@...nel.org>,
Konrad Dybcio <konradybcio@...nel.org>,
Krzysztof Kozlowski <krzk+dt@...nel.org>,
Conor Dooley <conor+dt@...nel.org>, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
linux-arm-msm@...r.kernel.org, devicetree@...r.kernel.org
Subject: Re: [PATCH v4 2/7] OPP: Move refcount and key update for readability
in _opp_table_find_key()
Marek,
Thanks for the detailed logs. I would need a little more help from
you.
Can you give this a try over your failing branch (I have already
dropped the patch from my tree for now):
diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 81fb7dd7f323..5b24255733b5 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -554,10 +554,10 @@ static struct dev_pm_opp *_opp_table_find_key(struct opp_table *opp_table,
list_for_each_entry(temp_opp, &opp_table->opp_list, node) {
if (temp_opp->available == available) {
if (compare(&opp, temp_opp, read(temp_opp, index), *key)) {
- *key = read(opp, index);
-
- /* Increment the reference count of OPP */
- dev_pm_opp_get(opp);
+ if (!IS_ERR(opp)) {
+ *key = read(opp, index);
+ dev_pm_opp_get(opp);
+ }
break;
}
}
On 25-08-25, 15:59, Marek Szyprowski wrote:
> 1. Exynos4412-based Odroid-U3 board (ARM 32bit):
>
> ============================================
> WARNING: possible recursive locking detected
> 6.17.0-rc3-next-20250825 #10901 Not tainted
> --------------------------------------------
> kworker/u16:0/12 is trying to acquire lock:
> cf896040 (&devfreq->lock){+.+.}-{3:3}, at: devfreq_notifier_call+0x30/0x124
>
> but task is already holding lock:
> cf896040 (&devfreq->lock){+.+.}-{3:3}, at: devfreq_monitor+0x1c/0x1a4
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&devfreq->lock);
> lock(&devfreq->lock);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 4 locks held by kworker/u16:0/12:
> #0: c289d0b4 ((wq_completion)devfreq_wq){+.+.}-{0:0}, at:
> process_one_work+0x1b0/0x70c
> #1: f0899f18
> ((work_completion)(&(&devfreq->work)->work)#2){+.+.}-{0:0}, at:
> process_one_work+0x1dc/0x70c
> #2: cf896040 (&devfreq->lock){+.+.}-{3:3}, at: devfreq_monitor+0x1c/0x1a4
> #3: c2e78c4c (&(&opp_table->head)->rwsem){++++}-{3:3}, at:
> blocking_notifier_call_chain+0x28/0x60
>
> stack backtrace:
> CPU: 2 UID: 0 PID: 12 Comm: kworker/u16:0 Not tainted
> 6.17.0-rc3-next-20250825 #10901 PREEMPT
> Hardware name: Samsung Exynos (Flattened Device Tree)
> Workqueue: devfreq_wq devfreq_monitor
> Call trace:
> unwind_backtrace from show_stack+0x10/0x14
> show_stack from dump_stack_lvl+0x68/0x88
> dump_stack_lvl from print_deadlock_bug+0x370/0x380
> print_deadlock_bug from __lock_acquire+0x1428/0x29ec
> __lock_acquire from lock_acquire+0x134/0x388
> lock_acquire from __mutex_lock+0xac/0x10c0
> __mutex_lock from mutex_lock_nested+0x1c/0x24
> mutex_lock_nested from devfreq_notifier_call+0x30/0x124
> devfreq_notifier_call from notifier_call_chain+0x84/0x1d4
> notifier_call_chain from blocking_notifier_call_chain+0x44/0x60
> blocking_notifier_call_chain from _opp_kref_release+0x3c/0x5c
> _opp_kref_release from exynos_bus_target+0x24/0x70
> exynos_bus_target from devfreq_set_target+0x8c/0x2e8
> devfreq_set_target from devfreq_update_target+0x9c/0xf8
> devfreq_update_target from devfreq_monitor+0x28/0x1a4
> devfreq_monitor from process_one_work+0x24c/0x70c
> process_one_work from worker_thread+0x1b8/0x3bc
> worker_thread from kthread+0x13c/0x264
> kthread from ret_from_fork+0x14/0x28
> Exception stack(0xf0899fb0 to 0xf0899ff8)
I guess there is some issue with devfreq here which showed up because
we tried to do a dev_pm_opp_get() for an invalid opp (which very well
could have been valid here anyway). This was always done with the OPP
table's lock anyway, nothing changed after the commit AFAICT.
> 2. Exynos5422-based Odroid-XU3 board (ARM 32bit):
>
> 8<--- cut here ---
> Unable to handle kernel NULL pointer dereference at virtual address
> 00000000 when read
> [00000000] *pgd=00000000
> Internal error: Oops: 5 [#1] SMP ARM
> Modules linked in:
> CPU: 7 UID: 0 PID: 68 Comm: kworker/u34:1 Not tainted
> 6.17.0-rc3-next-20250825 #10901 PREEMPT
> Hardware name: Samsung Exynos (Flattened Device Tree)
> Workqueue: devfreq_wq devfreq_monitor
> PC is at _opp_compare_key+0x30/0xb4
> LR is at 0xfffffffc
> pc : [<c09831c4>] lr : [<fffffffc>] psr: 20000013
> sp : f0a89de0 ip : cfb0e94c fp : c1574880
> r10: c14095a4 r9 : f0a89e44 r8 : c2a9c010
> r7 : cfb0ea80 r6 : 00000001 r5 : cfb0e900 r4 : 00000001
> r3 : 00000000 r2 : cfb0e900 r1 : cfb0ea80 r0 : cfaf5800
> Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
> Control: 10c5387d Table: 4000406a DAC: 00000051
> Register r0 information: slab kmalloc-1k start cfaf5800 pointer offset 0
> size 1024
> Register r1 information: slab kmalloc-128 start cfb0ea80 pointer offset
> 0 size 128
> Register r2 information: slab kmalloc-128 start cfb0e900 pointer offset
> 0 size 128
> Register r3 information: NULL pointer
> Register r4 information: non-paged memory
> Register r5 information: slab kmalloc-128 start cfb0e900 pointer offset
> 0 size 128
> Register r6 information: non-paged memory
> Register r7 information: slab kmalloc-128 start cfb0ea80 pointer offset
> 0 size 128
> Register r8 information: slab kmalloc-1k start c2a9c000 pointer offset
> 16 size 1024
> Register r9 information: 2-page vmalloc region starting at 0xf0a88000
> allocated at kernel_clone+0x58/0x3c4
> Register r10 information: non-slab/vmalloc memory
> Register r11 information: non-slab/vmalloc memory
> Register r12 information: slab kmalloc-128 start cfb0e900 pointer offset
> 76 size 128
> Process kworker/u34:1 (pid: 68, stack limit = 0x050eb3d7)
> Stack: (0xf0a89de0 to 0xf0a8a000)
> ..
> Call trace:
> _opp_compare_key from _set_opp+0x78/0x50c
> _set_opp from dev_pm_opp_set_rate+0x15c/0x21c
> dev_pm_opp_set_rate from panfrost_devfreq_target+0x2c/0x3c
> panfrost_devfreq_target from devfreq_set_target+0x8c/0x2e8
> devfreq_set_target from devfreq_update_target+0x9c/0xf8
> devfreq_update_target from devfreq_monitor+0x28/0x1a4
> devfreq_monitor from process_one_work+0x24c/0x70c
> process_one_work from worker_thread+0x1b8/0x3bc
> worker_thread from kthread+0x13c/0x264
> kthread from ret_from_fork+0x14/0x28
> Exception stack(0xf0a89fb0 to 0xf0a89ff8)
I don't fully understand why this happened yet.
> ---[ end trace 0000000000000000 ]---
>
>
> 3. Qualcomm Technologies, Inc. Robotics RB5(ARM 64bit):
>
> ufshcd-qcom 1d84000.ufshc: freq-table-hz property not specified
> ufshcd-qcom 1d84000.ufshc: ufshcd_populate_vreg: Unable to find
> vdd-hba-supply regulator, assuming enabled
> ufshcd-qcom 1d84000.ufshc: Failed to find OPP for MIN frequency
> ufshcd-qcom 1d84000.ufshc: ufshcd_pltfrm_init: OPP parse failed -34
> ufshcd-qcom 1d84000.ufshc: error -ERANGE: ufshcd_pltfrm_init() failed
> ufshcd-qcom 1d84000.ufshc: probe with driver ufshcd-qcom failed with
> error -34
This too.
--
viresh
Powered by blists - more mailing lists