[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <919ea7ce-a36a-4eaa-a13a-c693eb9c6c2f@oss.qualcomm.com>
Date: Mon, 25 Aug 2025 21:26:49 +0530
From: Krishna Chaitanya Chundru <krishna.chundru@....qualcomm.com>
To: Marek Szyprowski <m.szyprowski@...sung.com>,
Viresh Kumar <vireshk@...nel.org>
Cc: linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-pci@...r.kernel.org, linux-arm-msm@...r.kernel.org,
Krzysztof Wilczyński <kwilczynski@...nel.org>,
Conor Dooley <conor+dt@...nel.org>, devicetree@...r.kernel.org,
Lorenzo Pieralisi <lpieralisi@...nel.org>, Nishanth Menon <nm@...com>,
Stephen Boyd <sboyd@...nel.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Manivannan Sadhasivam <mani@...nel.org>, Rob Herring <robh@...nel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Bjorn Andersson <andersson@...nel.org>,
Krzysztof Kozlowski <krzk+dt@...nel.org>,
Konrad Dybcio <konradybcio@...nel.org>
Subject: Re: [PATCH v4 2/7] OPP: Move refcount and key update for readability
in _opp_table_find_key()
On 8/25/2025 7:29 PM, Marek Szyprowski wrote:
> On 20.08.2025 10:28, Krishna Chaitanya Chundru wrote:
>> Refactor _opp_table_find_key() to improve readability by moving the
>> reference count increment and key update inside the match condition block.
>>
>> Also make the 'assert' check mandatory instead of treating it as optional.
>>
>> Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@....qualcomm.com>
>
> This patch landed in today's linux-next (20250825) as commit
> b5323835f050 ("OPP: Reorganize _opp_table_find_key()"). In my tests I
> found that it causes regressions on my test boards. Reverting this
> change on top of linux-next fixes booting of all the affected boards.
>
> Here are kernel logs with lockdep enabled:
>
> 1. Exynos4412-based Odroid-U3 board (ARM 32bit):
>
> ============================================
> WARNING: possible recursive locking detected
> 6.17.0-rc3-next-20250825 #10901 Not tainted
> --------------------------------------------
> kworker/u16:0/12 is trying to acquire lock:
> cf896040 (&devfreq->lock){+.+.}-{3:3}, at: devfreq_notifier_call+0x30/0x124
>
> but task is already holding lock:
> cf896040 (&devfreq->lock){+.+.}-{3:3}, at: devfreq_monitor+0x1c/0x1a4
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&devfreq->lock);
> lock(&devfreq->lock);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 4 locks held by kworker/u16:0/12:
> #0: c289d0b4 ((wq_completion)devfreq_wq){+.+.}-{0:0}, at:
> process_one_work+0x1b0/0x70c
> #1: f0899f18
> ((work_completion)(&(&devfreq->work)->work)#2){+.+.}-{0:0}, at:
> process_one_work+0x1dc/0x70c
> #2: cf896040 (&devfreq->lock){+.+.}-{3:3}, at: devfreq_monitor+0x1c/0x1a4
> #3: c2e78c4c (&(&opp_table->head)->rwsem){++++}-{3:3}, at:
> blocking_notifier_call_chain+0x28/0x60
>
> stack backtrace:
> CPU: 2 UID: 0 PID: 12 Comm: kworker/u16:0 Not tainted
> 6.17.0-rc3-next-20250825 #10901 PREEMPT
> Hardware name: Samsung Exynos (Flattened Device Tree)
> Workqueue: devfreq_wq devfreq_monitor
> Call trace:
> unwind_backtrace from show_stack+0x10/0x14
> show_stack from dump_stack_lvl+0x68/0x88
> dump_stack_lvl from print_deadlock_bug+0x370/0x380
> print_deadlock_bug from __lock_acquire+0x1428/0x29ec
> __lock_acquire from lock_acquire+0x134/0x388
> lock_acquire from __mutex_lock+0xac/0x10c0
> __mutex_lock from mutex_lock_nested+0x1c/0x24
> mutex_lock_nested from devfreq_notifier_call+0x30/0x124
> devfreq_notifier_call from notifier_call_chain+0x84/0x1d4
> notifier_call_chain from blocking_notifier_call_chain+0x44/0x60
> blocking_notifier_call_chain from _opp_kref_release+0x3c/0x5c
> _opp_kref_release from exynos_bus_target+0x24/0x70
> exynos_bus_target from devfreq_set_target+0x8c/0x2e8
> devfreq_set_target from devfreq_update_target+0x9c/0xf8
> devfreq_update_target from devfreq_monitor+0x28/0x1a4
> devfreq_monitor from process_one_work+0x24c/0x70c
> process_one_work from worker_thread+0x1b8/0x3bc
> worker_thread from kthread+0x13c/0x264
> kthread from ret_from_fork+0x14/0x28
> Exception stack(0xf0899fb0 to 0xf0899ff8)
>
> ...
>
>
> 2. Exynos5422-based Odroid-XU3 board (ARM 32bit):
>
> 8<--- cut here ---
> Unable to handle kernel NULL pointer dereference at virtual address
> 00000000 when read
> [00000000] *pgd=00000000
> Internal error: Oops: 5 [#1] SMP ARM
> Modules linked in:
> CPU: 7 UID: 0 PID: 68 Comm: kworker/u34:1 Not tainted
> 6.17.0-rc3-next-20250825 #10901 PREEMPT
> Hardware name: Samsung Exynos (Flattened Device Tree)
> Workqueue: devfreq_wq devfreq_monitor
> PC is at _opp_compare_key+0x30/0xb4
> LR is at 0xfffffffc
> pc : [<c09831c4>] lr : [<fffffffc>] psr: 20000013
> sp : f0a89de0 ip : cfb0e94c fp : c1574880
> r10: c14095a4 r9 : f0a89e44 r8 : c2a9c010
> r7 : cfb0ea80 r6 : 00000001 r5 : cfb0e900 r4 : 00000001
> r3 : 00000000 r2 : cfb0e900 r1 : cfb0ea80 r0 : cfaf5800
> Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
> Control: 10c5387d Table: 4000406a DAC: 00000051
> Register r0 information: slab kmalloc-1k start cfaf5800 pointer offset 0
> size 1024
> Register r1 information: slab kmalloc-128 start cfb0ea80 pointer offset
> 0 size 128
> Register r2 information: slab kmalloc-128 start cfb0e900 pointer offset
> 0 size 128
> Register r3 information: NULL pointer
> Register r4 information: non-paged memory
> Register r5 information: slab kmalloc-128 start cfb0e900 pointer offset
> 0 size 128
> Register r6 information: non-paged memory
> Register r7 information: slab kmalloc-128 start cfb0ea80 pointer offset
> 0 size 128
> Register r8 information: slab kmalloc-1k start c2a9c000 pointer offset
> 16 size 1024
> Register r9 information: 2-page vmalloc region starting at 0xf0a88000
> allocated at kernel_clone+0x58/0x3c4
> Register r10 information: non-slab/vmalloc memory
> Register r11 information: non-slab/vmalloc memory
> Register r12 information: slab kmalloc-128 start cfb0e900 pointer offset
> 76 size 128
> Process kworker/u34:1 (pid: 68, stack limit = 0x050eb3d7)
> Stack: (0xf0a89de0 to 0xf0a8a000)
> ..
> Call trace:
> _opp_compare_key from _set_opp+0x78/0x50c
> _set_opp from dev_pm_opp_set_rate+0x15c/0x21c
> dev_pm_opp_set_rate from panfrost_devfreq_target+0x2c/0x3c
> panfrost_devfreq_target from devfreq_set_target+0x8c/0x2e8
> devfreq_set_target from devfreq_update_target+0x9c/0xf8
> devfreq_update_target from devfreq_monitor+0x28/0x1a4
> devfreq_monitor from process_one_work+0x24c/0x70c
> process_one_work from worker_thread+0x1b8/0x3bc
> worker_thread from kthread+0x13c/0x264
> kthread from ret_from_fork+0x14/0x28
> Exception stack(0xf0a89fb0 to 0xf0a89ff8)
> ...
> ---[ end trace 0000000000000000 ]---
>
>
> 3. Qualcomm Technologies, Inc. Robotics RB5(ARM 64bit):
>
> ufshcd-qcom 1d84000.ufshc: freq-table-hz property not specified
> ufshcd-qcom 1d84000.ufshc: ufshcd_populate_vreg: Unable to find
> vdd-hba-supply regulator, assuming enabled
> ufshcd-qcom 1d84000.ufshc: Failed to find OPP for MIN frequency
> ufshcd-qcom 1d84000.ufshc: ufshcd_pltfrm_init: OPP parse failed -34
> ufshcd-qcom 1d84000.ufshc: error -ERANGE: ufshcd_pltfrm_init() failed
> ufshcd-qcom 1d84000.ufshc: probe with driver ufshcd-qcom failed with
> error -34
>
>
>
>> ---
>> drivers/opp/core.c | 14 ++++++--------
>> 1 file changed, 6 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
>> index a36c3daac39cd0bdd2a1f7e9bad5b92f0c756153..bf49709b8c39271431772924daf0c003b45eec7f 100644
>> --- a/drivers/opp/core.c
>> +++ b/drivers/opp/core.c
>> @@ -544,24 +544,22 @@ static struct dev_pm_opp *_opp_table_find_key(struct opp_table *opp_table,
>> struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);
>>
>> /* Assert that the requirement is met */
>> - if (assert && !assert(opp_table, index))
>> + if (!assert(opp_table, index))
>> return ERR_PTR(-EINVAL);
>>
>> guard(mutex)(&opp_table->lock);
>>
>> list_for_each_entry(temp_opp, &opp_table->opp_list, node) {
>> if (temp_opp->available == available) {
>> - if (compare(&opp, temp_opp, read(temp_opp, index), *key))
>> + if (compare(&opp, temp_opp, read(temp_opp, index), *key)) {
>> + /* Increment the reference count of OPP */
>> + *key = read(opp, index);
>> + dev_pm_opp_get(opp);
>> break;
>> + }
>> }
>> }
>>
>> - /* Increment the reference count of OPP */
>> - if (!IS_ERR(opp)) {
>> - *key = read(opp, index);
>> - dev_pm_opp_get(opp);
>> - }
>> -
>> return opp;
>> }
>>
>>
> Best regards
Thanks Marek for reporting.
Viresh,
looks like for compare_floor we need to iterate to the OPP table till
the OPP key is greater than the target key and return previous OPP.
In that case the updation of the key and dev_pm_opp_get() should be
outside as before. We need to remove this part of the patch.
Thanks & Regards,
Krishna Chaitanya.
Powered by blists - more mailing lists