lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250826060647.7eerg5blx3xho27p@vireshk-i7>
Date: Tue, 26 Aug 2025 11:36:47 +0530
From: Viresh Kumar <viresh.kumar@...aro.org>
To: Marek Szyprowski <m.szyprowski@...sung.com>
Cc: Krishna Chaitanya Chundru <krishna.chundru@....qualcomm.com>,
	Viresh Kumar <vireshk@...nel.org>, Nishanth Menon <nm@...com>,
	Stephen Boyd <sboyd@...nel.org>,
	"Rafael J. Wysocki" <rafael@...nel.org>,
	Manivannan Sadhasivam <mani@...nel.org>,
	Lorenzo Pieralisi <lpieralisi@...nel.org>,
	Krzysztof Wilczyński <kwilczynski@...nel.org>,
	Rob Herring <robh@...nel.org>, Bjorn Helgaas <bhelgaas@...gle.com>,
	Bjorn Andersson <andersson@...nel.org>,
	Konrad Dybcio <konradybcio@...nel.org>,
	Krzysztof Kozlowski <krzk+dt@...nel.org>,
	Conor Dooley <conor+dt@...nel.org>, linux-pm@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
	linux-arm-msm@...r.kernel.org, devicetree@...r.kernel.org
Subject: Re: [PATCH v4 2/7] OPP: Move refcount and key update for readability
 in _opp_table_find_key()

Marek,

Thanks for the detailed logs. I would need a little more help from
you.

Can you give this a try over your failing branch (I have already
dropped the patch from my tree for now):

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 81fb7dd7f323..5b24255733b5 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -554,10 +554,10 @@ static struct dev_pm_opp *_opp_table_find_key(struct opp_table *opp_table,
        list_for_each_entry(temp_opp, &opp_table->opp_list, node) {
                if (temp_opp->available == available) {
                        if (compare(&opp, temp_opp, read(temp_opp, index), *key)) {
-                               *key = read(opp, index);
-
-                               /* Increment the reference count of OPP */
-                               dev_pm_opp_get(opp);
+                               if (!IS_ERR(opp)) {
+                                       *key = read(opp, index);
+                                       dev_pm_opp_get(opp);
+                               }
                                break;
                        }
                }

On 25-08-25, 15:59, Marek Szyprowski wrote:
> 1. Exynos4412-based Odroid-U3 board (ARM 32bit):
> 
> ============================================
> WARNING: possible recursive locking detected
> 6.17.0-rc3-next-20250825 #10901 Not tainted
> --------------------------------------------
> kworker/u16:0/12 is trying to acquire lock:
> cf896040 (&devfreq->lock){+.+.}-{3:3}, at: devfreq_notifier_call+0x30/0x124
> 
> but task is already holding lock:
> cf896040 (&devfreq->lock){+.+.}-{3:3}, at: devfreq_monitor+0x1c/0x1a4
> 
> other info that might help us debug this:
>   Possible unsafe locking scenario:
> 
>         CPU0
>         ----
>    lock(&devfreq->lock);
>    lock(&devfreq->lock);
> 
>   *** DEADLOCK ***
> 
>   May be due to missing lock nesting notation
> 
> 4 locks held by kworker/u16:0/12:
>   #0: c289d0b4 ((wq_completion)devfreq_wq){+.+.}-{0:0}, at: 
> process_one_work+0x1b0/0x70c
>   #1: f0899f18 
> ((work_completion)(&(&devfreq->work)->work)#2){+.+.}-{0:0}, at: 
> process_one_work+0x1dc/0x70c
>   #2: cf896040 (&devfreq->lock){+.+.}-{3:3}, at: devfreq_monitor+0x1c/0x1a4
>   #3: c2e78c4c (&(&opp_table->head)->rwsem){++++}-{3:3}, at: 
> blocking_notifier_call_chain+0x28/0x60
> 
> stack backtrace:
> CPU: 2 UID: 0 PID: 12 Comm: kworker/u16:0 Not tainted 
> 6.17.0-rc3-next-20250825 #10901 PREEMPT
> Hardware name: Samsung Exynos (Flattened Device Tree)
> Workqueue: devfreq_wq devfreq_monitor
> Call trace:
>   unwind_backtrace from show_stack+0x10/0x14
>   show_stack from dump_stack_lvl+0x68/0x88
>   dump_stack_lvl from print_deadlock_bug+0x370/0x380
>   print_deadlock_bug from __lock_acquire+0x1428/0x29ec
>   __lock_acquire from lock_acquire+0x134/0x388
>   lock_acquire from __mutex_lock+0xac/0x10c0
>   __mutex_lock from mutex_lock_nested+0x1c/0x24
>   mutex_lock_nested from devfreq_notifier_call+0x30/0x124
>   devfreq_notifier_call from notifier_call_chain+0x84/0x1d4
>   notifier_call_chain from blocking_notifier_call_chain+0x44/0x60
>   blocking_notifier_call_chain from _opp_kref_release+0x3c/0x5c
>   _opp_kref_release from exynos_bus_target+0x24/0x70
>   exynos_bus_target from devfreq_set_target+0x8c/0x2e8
>   devfreq_set_target from devfreq_update_target+0x9c/0xf8
>   devfreq_update_target from devfreq_monitor+0x28/0x1a4
>   devfreq_monitor from process_one_work+0x24c/0x70c
>   process_one_work from worker_thread+0x1b8/0x3bc
>   worker_thread from kthread+0x13c/0x264
>   kthread from ret_from_fork+0x14/0x28
> Exception stack(0xf0899fb0 to 0xf0899ff8)

I guess there is some issue with devfreq here which showed up because
we tried to do a dev_pm_opp_get() for an invalid opp (which very well
could have been valid here anyway). This was always done with the OPP
table's lock anyway, nothing changed after the commit AFAICT.

> 2. Exynos5422-based Odroid-XU3 board (ARM 32bit):
> 
> 8<--- cut here ---
> Unable to handle kernel NULL pointer dereference at virtual address 
> 00000000 when read
> [00000000] *pgd=00000000
> Internal error: Oops: 5 [#1] SMP ARM
> Modules linked in:
> CPU: 7 UID: 0 PID: 68 Comm: kworker/u34:1 Not tainted 
> 6.17.0-rc3-next-20250825 #10901 PREEMPT
> Hardware name: Samsung Exynos (Flattened Device Tree)
> Workqueue: devfreq_wq devfreq_monitor
> PC is at _opp_compare_key+0x30/0xb4
> LR is at 0xfffffffc
> pc : [<c09831c4>]    lr : [<fffffffc>]    psr: 20000013
> sp : f0a89de0  ip : cfb0e94c  fp : c1574880
> r10: c14095a4  r9 : f0a89e44  r8 : c2a9c010
> r7 : cfb0ea80  r6 : 00000001  r5 : cfb0e900  r4 : 00000001
> r3 : 00000000  r2 : cfb0e900  r1 : cfb0ea80  r0 : cfaf5800
> Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> Control: 10c5387d  Table: 4000406a  DAC: 00000051
> Register r0 information: slab kmalloc-1k start cfaf5800 pointer offset 0 
> size 1024
> Register r1 information: slab kmalloc-128 start cfb0ea80 pointer offset 
> 0 size 128
> Register r2 information: slab kmalloc-128 start cfb0e900 pointer offset 
> 0 size 128
> Register r3 information: NULL pointer
> Register r4 information: non-paged memory
> Register r5 information: slab kmalloc-128 start cfb0e900 pointer offset 
> 0 size 128
> Register r6 information: non-paged memory
> Register r7 information: slab kmalloc-128 start cfb0ea80 pointer offset 
> 0 size 128
> Register r8 information: slab kmalloc-1k start c2a9c000 pointer offset 
> 16 size 1024
> Register r9 information: 2-page vmalloc region starting at 0xf0a88000 
> allocated at kernel_clone+0x58/0x3c4
> Register r10 information: non-slab/vmalloc memory
> Register r11 information: non-slab/vmalloc memory
> Register r12 information: slab kmalloc-128 start cfb0e900 pointer offset 
> 76 size 128
> Process kworker/u34:1 (pid: 68, stack limit = 0x050eb3d7)
> Stack: (0xf0a89de0 to 0xf0a8a000)
> ..
> Call trace:
>   _opp_compare_key from _set_opp+0x78/0x50c
>   _set_opp from dev_pm_opp_set_rate+0x15c/0x21c
>   dev_pm_opp_set_rate from panfrost_devfreq_target+0x2c/0x3c
>   panfrost_devfreq_target from devfreq_set_target+0x8c/0x2e8
>   devfreq_set_target from devfreq_update_target+0x9c/0xf8
>   devfreq_update_target from devfreq_monitor+0x28/0x1a4
>   devfreq_monitor from process_one_work+0x24c/0x70c
>   process_one_work from worker_thread+0x1b8/0x3bc
>   worker_thread from kthread+0x13c/0x264
>   kthread from ret_from_fork+0x14/0x28
> Exception stack(0xf0a89fb0 to 0xf0a89ff8)

I don't fully understand why this happened yet.

> ---[ end trace 0000000000000000 ]---
> 
> 
> 3. Qualcomm Technologies, Inc. Robotics RB5(ARM 64bit):
> 
> ufshcd-qcom 1d84000.ufshc: freq-table-hz property not specified
> ufshcd-qcom 1d84000.ufshc: ufshcd_populate_vreg: Unable to find 
> vdd-hba-supply regulator, assuming enabled
> ufshcd-qcom 1d84000.ufshc: Failed to find OPP for MIN frequency
> ufshcd-qcom 1d84000.ufshc: ufshcd_pltfrm_init: OPP parse failed -34
> ufshcd-qcom 1d84000.ufshc: error -ERANGE: ufshcd_pltfrm_init() failed
> ufshcd-qcom 1d84000.ufshc: probe with driver ufshcd-qcom failed with 
> error -34

This too.

-- 
viresh

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ