Message-ID: <b05e9c76-c0ed-9ecb-8225-9504e226677b@linaro.org>
Date: Mon, 12 Jul 2021 21:18:25 -0400
From: Thara Gopinath <thara.gopinath@...aro.org>
To: Viresh Kumar <viresh.kumar@...aro.org>
Cc: agross@...nel.org, bjorn.andersson@...aro.org, rui.zhang@...el.com,
daniel.lezcano@...aro.org, rjw@...ysocki.net, robh+dt@...nel.org,
tdas@...eaurora.org, mka@...omium.org,
linux-arm-msm@...r.kernel.org, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org, devicetree@...r.kernel.org
Subject: Re: [Patch v3 3/6] cpufreq: qcom-cpufreq-hw: Add dcvs interrupt
support
On 7/12/21 12:41 AM, Viresh Kumar wrote:
> On 09-07-21, 11:37, Thara Gopinath wrote:
>> On 7/9/21 2:46 AM, Viresh Kumar wrote:
>>>> @@ -389,6 +503,10 @@ static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
>>>> dev_pm_opp_remove_all_dynamic(cpu_dev);
>>>> dev_pm_opp_of_cpumask_remove_table(policy->related_cpus);
>>>> + if (data->lmh_dcvs_irq > 0) {
>>>> + devm_free_irq(cpu_dev, data->lmh_dcvs_irq, data);
>>>
>>> Why using devm variants here and while requesting the irq ?
>
> Missed this one ?
Yep. I just replied to Bjorn's email on this. I will move to the non-devm
version.
>
>>>
>>>> + cancel_delayed_work_sync(&data->lmh_dcvs_poll_work);
>>>> + }
>>>
>>> Please move this to qcom_cpufreq_hw_lmh_exit() or something.
>>
>> Ok.
>>
>>>
>>> Now with sequence of disabling interrupt, etc, I see a potential
>>> problem.
>>>
>>> CPU0                                CPU1
>>>
>>> qcom_cpufreq_hw_cpu_exit()
>>>   -> devm_free_irq();
>>>                                     qcom_lmh_dcvs_poll()
>>>                                       -> qcom_lmh_dcvs_notify()
>>>                                         -> enable_irq()
>>>   -> cancel_delayed_work_sync();
>>>
>>>
>>> What will happen if enable_irq() gets called after freeing the irq ?
>>> Not sure, but it looks like you will hit this then from manage.c:
>>>
>>> WARN(!desc->irq_data.chip, KERN_ERR "enable_irq before
>>> setup/request_irq: irq %u\n", irq))
>>>
>>> ?
>>>
>>> You got a chicken n egg problem :)
>>
>> Yes indeed! But it is also a very rare chicken-and-egg problem. The
>> scenario here is that the cpus are busy running a load that causes a
>> thermal overrun, so lmh is engaged. At the same time, for this issue
>> to be hit, the cpu has to be exiting/disabling cpufreq.
>
> Yes, it is a very specific case but it needs to be resolved anyway. You don't
> want to get this ever :)
>
>> Calling
>> cancel_delayed_work_sync first could solve this issue, right ?
>> cancel_delayed_work_sync guarantees the work not to be pending even if
>> it requeues itself on return. So once the delayed work is cancelled, the
>> interrupts can be safely disabled. Thoughts ?
>
> I don't think even that would provide such guarantees to you here, as there is
> a chance the work gets queued again because of an interrupt that triggers right
> after you cancel the work.
>
> The basic way of solving such issues is that once you cancel something, you need
> to guarantee that it doesn't get triggered again, no matter what.
>
> The problem here I see is with your design itself, both delayed work and irq can
> enable each other, so no matter which one you disable first, won't be
> sufficient. You need to fix that design somehow.
So I really need the interrupt to fire first and then have the timer
take over the monitoring. I can think of introducing a variable, say
is_disabled, that is updated and read under a spinlock.
qcom_cpufreq_hw_cpu_exit() can hold the spinlock and set is_disabled to
true before cancelling the work queue or disabling the interrupt.
qcom_lmh_dcvs_notify() can then check is_disabled under the same lock
before re-enabling the interrupt or re-queuing the work.
But doesn't this problem also exist in target_index(), fast_switch(),
etc.? One cpu can be disabling while the other one is updating the
target, right?
>
--
Warm Regards
Thara (She/Her/Hers)