[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <kcgqm3ywdzdttllk357szirjdvdddsrcznfuwux6y237ncjnwb@3ots4rqreznu>
Date: Fri, 28 Jun 2024 10:40:23 +0300
From: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
To: Varadarajan Narayanan <quic_varada@...cinc.com>
Cc: robh@...nel.org, krzk+dt@...nel.org, conor+dt@...nel.org,
angelogioacchino.delregno@...labora.com, andersson@...nel.org, konrad.dybcio@...aro.org,
mturquette@...libre.com, sboyd@...nel.org, ilia.lin@...nel.org, rafael@...nel.org,
viresh.kumar@...aro.org, ulf.hansson@...aro.org, quic_sibis@...cinc.com,
otto.pflueger@...cue.de, neil.armstrong@...aro.org, luca@...tu.xyz, abel.vesa@...aro.org,
danila@...xyga.com, quic_ipkumar@...cinc.com, linux-arm-msm@...r.kernel.org,
devicetree@...r.kernel.org, linux-kernel@...r.kernel.org, linux-clk@...r.kernel.org,
linux-pm@...r.kernel.org
Subject: Re: [PATCH v3 2/9] cpufreq: qcom-nvmem: Add genpd names to
match_data_kryo
On Fri, Jun 28, 2024 at 12:02:48PM GMT, Varadarajan Narayanan wrote:
> On Wed, Jun 26, 2024 at 09:23:17PM +0300, Dmitry Baryshkov wrote:
> > On Wed, Jun 26, 2024 at 04:09:55PM GMT, Varadarajan Narayanan wrote:
> > > This is used for tying up the cpu@N nodes with the power domains.
> > > Without this, 'cat /sys/kernel/debug/qcom_cpr3/thread0'
> > > crashes with NULL pointer access.
> >
> > Add the interesting part of the backtrace, please.
>
> if (thread->drv->desc->cpr_type < CTRL_TYPE_CPRH) {
> seq_printf(s, "current_volt = %d uV\n", thread->drv->last_uV);
> seq_printf(s, "requested voltage: %d uV\n", thread->corner->last_uV);
> }
>
> thread->corner is NULL in the second printf above.
>
> # cat /sys/kernel/debug/qcom_cpr3/thread0
> [ 16.965241] Unable to handle kernel NULL pointer dereference at virtual address 000000000000000c
> [ 16.965270] Mem abort info:
> [ 16.973181] ESR = 0x0000000096000004
> [ 16.975607] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 16.979425] SET = 0, FnV = 0
> [ 16.984889] EA = 0, S1PTW = 0
> [ 16.987756] FSC = 0x04: level 0 translation fault
> [ 16.990792] Data abort info:
> [ 16.995652] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
> [ 16.998779] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [ 17.004074] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [ 17.009196] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000481b1000
> [ 17.014579] [000000000000000c] pgd=0000000000000000, p4d=0000000000000000
> [ 17.020919] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
> [ 17.020921] Modules linked in:
> [ 17.020926] CPU: 0 UID: 0 PID: 118 Comm: cat Not tainted 6.10.0-rc4-next-20240620-00020-g125eb3184fc1-dirty #9
> [ 17.020931] Hardware name: Qualcomm Technologies, Inc. IPQ9574/AP-AL02-C7 (DT)
> [ 17.020933] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 17.020936] pc : cpr3_debug_info_show+0x3a0/0x3ac
> [ 17.020945] lr : cpr3_debug_info_show+0x390/0x3ac
> [ 17.020948] sp : ffff800086293b90
> [ 17.020949] x29: ffff800086293b90 x28: ffff0000034ae038 x27: 0000000000400cc0
> [ 17.020953] x26: 000000007ffff000 x25: ffff0000034ae028 x24: 0000000000000000
> [ 17.020957] x23: ffff800086293c80 x22: ffff000002399880 x21: ffff000002a8fa80
> [ 17.020960] x20: ffff0000034ae000 x19: 0000000000000000 x18: ffffffffffffffff
> [ 17.020964] x17: 0000000000000000 x16: 0000000000000000 x15: ffff800086293a40
> [ 17.020967] x14: ffff000002913000 x13: ffff00000291200f x12: 0000000000000000
> [ 17.020970] x11: 0000000000000000 x10: 0000000000000000 x9 : ffff0000034a9000
> [ 17.020973] x8 : 000000000a567520 x7 : 0000000000000001 x6 : 000000000a567520
> [ 17.020976] x5 : ffff000002912014 x4 : ffff800080e1f3a5 x3 : 0000000000000014
> [ 17.020979] x2 : 0000000000000000 x1 : ffff800080e1f848 x0 : ffff0000034ae000
> [ 17.020983] Call trace:
> [ 17.020984] cpr3_debug_info_show+0x3a0/0x3ac
> [ 17.020987] seq_read_iter+0xe0/0x45c
> [ 17.020993] seq_read+0xec/0x130
> [ 17.020996] full_proxy_read+0x60/0xb4
> [ 17.020999] vfs_read+0xc0/0x31c
> [ 17.021003] ksys_read+0x70/0x104
> [ 17.021006] __arm64_sys_read+0x1c/0x28
> [ 17.021008] invoke_syscall+0x48/0x114
> [ 17.021014] el0_svc_common+0x3c/0xe8
> [ 17.021017] do_el0_svc+0x20/0x2c
> [ 17.021020] el0_svc+0x34/0xd8
> [ 17.021024] el0t_64_sync_handler+0x120/0x12c
> [ 17.021027] el0t_64_sync+0x190/0x194
> [ 17.021031] Code: f94012c2 aa1403e0 b0004701 91212021 (b9400c42)
> [ 17.021033] ---[ end trace 0000000000000000 ]---
> Segmentation fault
Well, I asked to add it, so please drop the timestamps and registers and
include it into the commit message. While you are at it, please review
Documentation/process/submitting-patches.rst and change the commit
message to follow the guidelines.
While doing so (and while responding to a comment below) you will notice
that this change should not be applied to the generic match_data_kryo
instance. Instead you should add a device-specific entry into match
table and use struct qcom_cpufreq_match_data instance that has
.genpd_names set.
>
> > > Signed-off-by: Varadarajan Narayanan <quic_varada@...cinc.com>
> > > ---
> > > drivers/cpufreq/qcom-cpufreq-nvmem.c | 1 +
> > > 1 file changed, 1 insertion(+)
> > >
> > > diff --git a/drivers/cpufreq/qcom-cpufreq-nvmem.c b/drivers/cpufreq/qcom-cpufreq-nvmem.c
> > > index 939702dfa73f..5e6525c7788c 100644
> > > --- a/drivers/cpufreq/qcom-cpufreq-nvmem.c
> > > +++ b/drivers/cpufreq/qcom-cpufreq-nvmem.c
> > > @@ -399,6 +399,7 @@ static const char *generic_genpd_names[] = { "perf", NULL };
> > >
> > > static const struct qcom_cpufreq_match_data match_data_kryo = {
> > > .get_version = qcom_cpufreq_kryo_name_version,
> > > + .genpd_names = generic_genpd_names,
> >
> > This forces that every Kryo SoC has "perf" genpd, which obviously isn't
> > corret (at least from the upstream support point of view).
>
> While trying to get the above backtrace, randomly during boot
> I see the following BUG too.
This isn't a response to my comment.
>
> [ 1.562847] ------------[ cut here ]------------
> [ 1.574342] kernel BUG at drivers/cpufreq/cpufreq.c:1542!
> [ 1.579203] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
> [ 1.579209] Modules linked in:
> [ 1.579217] CPU: 2 UID: 0 PID: 11 Comm: kworker/u16:0 Not tainted 6.10.0-rc4-next-20240620-00020-g125eb3184fc1-dirty #10
> [ 1.579227] Hardware name: Qualcomm Technologies, Inc. IPQ9574/AP-AL02-C7 (DT)
> [ 1.579232] Workqueue: events_unbound deferred_probe_work_func
> [ 1.579249] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 1.579257] pc : cpufreq_online+0x938/0x954
> [ 1.579271] lr : cpufreq_online+0x788/0x954
> [ 1.579281] sp : ffff8000817c3520
> [ 1.579283] x29: ffff8000817c3520 x28: ffff0000029efa50 x27: 0000000000000001
> [ 1.579294] x26: 0000000000000001 x25: ffff8000814d8da0 x24: 0000000000000000
> [ 1.579303] x23: ffff0000029ef9d0 x22: ffff800081735000 x21: 0000000000000000
> [ 1.579312] x20: 00000000000c15c0 x19: ffff0000029ef800 x18: ffff00000183481c
> [ 1.579321] x17: ffff8000818a3638 x16: 0000000000000000 x15: ffff8000818a3670
> [ 1.579330] x14: 0000000000000003 x13: ffff00000192b140 x12: ffff8000814d8c58
> [ 1.579338] x11: ffff00000192b140 x10: 00000000000009b0 x9 : ffff8000817c3240
> [ 1.579347] x8 : ffff00000192bad0 x7 : 0000000000000001 x6 : ffff8000814d8da0
> [ 1.579355] x5 : ffff8000812c32d0 x4 : 0000000000000000 x3 : 0000000000000000
> [ 1.579363] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 00000000fffffff0
> [ 1.579372] Call trace:
> [ 1.579375] cpufreq_online+0x938/0x954
> [ 1.579386] cpufreq_add_dev+0x80/0x98
> [ 1.579395] subsys_interface_register+0x100/0x130
> [ 1.579404] cpufreq_register_driver+0x150/0x244
> [ 1.579413] dt_cpufreq_probe+0x8c/0x440
> [ 1.579420] platform_probe+0x68/0xc8
> [ 1.579430] really_probe+0xbc/0x29c
> [ 1.579438] __driver_probe_device+0x78/0x12c
> [ 1.579446] driver_probe_device+0xd8/0x15c
> [ 1.579454] __device_attach_driver+0xb8/0x134
> [ 1.579463] bus_for_each_drv+0x84/0xe0
> [ 1.579470] __device_attach+0x9c/0x188
> [ 1.579478] device_initial_probe+0x14/0x20
> [ 1.579487] bus_probe_device+0xac/0xb0
> [ 1.579494] device_add+0x55c/0x720
> [ 1.579500] platform_device_add+0x1b8/0x244
> [ 1.579510] platform_device_register_full+0xfc/0x184
> [ 1.579516] platform_device_register_resndata.constprop.0+0x5c/0x8c
> [ 1.579524] qcom_cpufreq_probe+0x1e4/0x498
> [ 1.579531] platform_probe+0x68/0xc8
> [ 1.579540] really_probe+0xbc/0x29c
> [ 1.579548] __driver_probe_device+0x78/0x12c
> [ 1.579556] driver_probe_device+0xd8/0x15c
> [ 1.579564] __device_attach_driver+0xb8/0x134
> [ 1.579573] bus_for_each_drv+0x84/0xe0
> [ 1.579580] __device_attach+0x9c/0x188
> [ 1.579588] device_initial_probe+0x14/0x20
> [ 1.579596] bus_probe_device+0xac/0xb0
> [ 1.579603] deferred_probe_work_func+0x88/0xc0
> [ 1.579611] process_one_work+0x148/0x28c
> [ 1.579623] worker_thread+0x2e8/0x3f8
> [ 1.579633] kthread+0x110/0x114
> [ 1.579641] ret_from_fork+0x10/0x20
> [ 1.579653] Code: aa1703e0 52800021 97e38ed5 17ffffea (d4210000)
> [ 1.579657] ---[ end trace 0000000000000000 ]---
> [ 1.851078] note: kworker/u16:0[11] exited with irqs disabled
> [ 1.855791] note: kworker/u16:0[11] exited with preempt_count 1
> [ 1.861586] ------------[ cut here ]------------
>
> Randomly, the following call seems to return -EBUSY causing
> the above BUG().
>
> ret = __cpufreq_driver_target(policy, old_freq - 1,
> CPUFREQ_RELATION_L);
>
> /*
> * Reaching here after boot in a few seconds may not
> * mean that system will remain stable at "unknown"
> * frequency for longer duration. Hence, a BUG_ON().
> */
> BUG_ON(ret);
>
> Not sure why this does not happen in every boot, and how it
> is tied to genpd_names. Will debug and update.
Thanks!
--
With best wishes
Dmitry
Powered by blists - more mailing lists