Date:   Mon, 12 Jul 2021 21:18:25 -0400
From:   Thara Gopinath <thara.gopinath@...aro.org>
To:     Viresh Kumar <viresh.kumar@...aro.org>
Cc:     agross@...nel.org, bjorn.andersson@...aro.org, rui.zhang@...el.com,
        daniel.lezcano@...aro.org, rjw@...ysocki.net, robh+dt@...nel.org,
        tdas@...eaurora.org, mka@...omium.org,
        linux-arm-msm@...r.kernel.org, linux-pm@...r.kernel.org,
        linux-kernel@...r.kernel.org, devicetree@...r.kernel.org
Subject: Re: [Patch v3 3/6] cpufreq: qcom-cpufreq-hw: Add dcvs interrupt
 support



On 7/12/21 12:41 AM, Viresh Kumar wrote:
> On 09-07-21, 11:37, Thara Gopinath wrote:
>> On 7/9/21 2:46 AM, Viresh Kumar wrote:
>>>> @@ -389,6 +503,10 @@ static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
>>>>    	dev_pm_opp_remove_all_dynamic(cpu_dev);
>>>>    	dev_pm_opp_of_cpumask_remove_table(policy->related_cpus);
>>>> +	if (data->lmh_dcvs_irq > 0) {
>>>> +		devm_free_irq(cpu_dev, data->lmh_dcvs_irq, data);
>>>
>>> Why using devm variants here and while requesting the irq ?
> 
> Missed this one?

Yep. I just replied to Bjorn's email on this. I will move to the
non-devm version.
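
i.e. something along these lines (untested; the handler name and the
irq name string are assumptions here, not taken from the patch):

	ret = request_irq(data->lmh_dcvs_irq, qcom_lmh_dcvs_handle_irq,
			  0, "dcvsh-irq", data);
	if (ret)
		return ret;

and an explicit free_irq(data->lmh_dcvs_irq, data) in the exit path
instead of devm_free_irq().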

> 
>>>
>>>> +		cancel_delayed_work_sync(&data->lmh_dcvs_poll_work);
>>>> +	}
>>>
>>> Please move this to qcom_cpufreq_hw_lmh_exit() or something.
>>
>> Ok.
>>
>>>
>>> Now, with the sequence of disabling the interrupt, etc., I see a
>>> potential problem.
>>>
>>> CPU0                                    CPU1
>>>
>>> qcom_cpufreq_hw_cpu_exit()
>>> -> devm_free_irq();
>>>                                           qcom_lmh_dcvs_poll()
>>>                                           -> qcom_lmh_dcvs_notify()
>>>                                             -> enable_irq()
>>>
>>> -> cancel_delayed_work_sync();
>>>
>>>
>>> What will happen if enable_irq() gets called after freeing the irq?
>>> Not sure, but it looks like you will then hit this from manage.c:
>>>
>>> WARN(!desc->irq_data.chip, KERN_ERR "enable_irq before
>>>                                        setup/request_irq: irq %u\n", irq))
>>>
>>> ?
>>>
>>> You've got a chicken-and-egg problem :)
>>
>> Yes indeed! But it is also a very rare chicken-and-egg problem. The
>> scenario here is that the cpus are busy running a load that causes a
>> thermal overrun, so lmh is engaged. At the same time, for this issue
>> to be hit, the cpu has to be exiting/disabling cpufreq.
> 
> Yes, it is a very specific case but it needs to be resolved anyway. You don't
> want to get this ever :)
> 
>> Calling
>> cancel_delayed_work_sync() first could solve this issue, right?
>> cancel_delayed_work_sync() guarantees that the work is not pending on
>> return, even if it requeues itself. So once the delayed work is
>> cancelled, the interrupt can be safely disabled. Thoughts?
> 
> I don't think even that would provide such guarantees to you here, as there is
> a chance the work gets queued again because of an interrupt that triggers right
> after you cancel the work.
> 
> The basic way of solving such issues is that once you cancel something, you need
> to guarantee that it doesn't get triggered again, no matter what.
> 
> The problem I see here is with your design itself: the delayed work and
> the irq can each re-enable the other, so no matter which one you disable
> first, it won't be sufficient. You need to fix that design somehow.

So I really do need the interrupt to fire and then the timer to kick in 
and take over the monitoring. I can think of introducing a variable, 
is_disabled, which is updated and read under a spinlock. 
qcom_cpufreq_hw_cpu_exit() can hold the spinlock and set is_disabled to 
true prior to cancelling the work or freeing the interrupt. Before 
re-enabling the interrupt or re-queuing the work in 
qcom_lmh_dcvs_notify(), is_disabled can be read and checked.
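
Roughly something like the below (untested sketch; the names
throttle_lock/cancel_throttle, the notify signature and the 10 ms poll
interval are made up just to show the shape of the idea):

#include <linux/interrupt.h>
#include <linux/jiffies.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>

struct qcom_cpufreq_data {
	int lmh_dcvs_irq;
	struct delayed_work lmh_dcvs_poll_work;
	spinlock_t throttle_lock;	/* protects cancel_throttle */
	bool cancel_throttle;		/* the "is_disabled" flag */
};

/*
 * Called from both the polling work and the (threaded) irq path, i.e.
 * process context; use spin_lock_irqsave() instead if this can run in
 * hard irq context.
 */
static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data,
				 bool still_throttled)
{
	/*
	 * Check the flag under the lock so that neither path can re-arm
	 * anything once the exit path has started.
	 */
	spin_lock(&data->throttle_lock);
	if (data->cancel_throttle) {
		spin_unlock(&data->throttle_lock);
		return;
	}
	if (still_throttled)
		schedule_delayed_work(&data->lmh_dcvs_poll_work,
				      msecs_to_jiffies(10));
	else
		enable_irq(data->lmh_dcvs_irq);
	spin_unlock(&data->throttle_lock);
}

static void qcom_cpufreq_hw_lmh_exit(struct qcom_cpufreq_data *data)
{
	/* Stop both re-arm paths first ... */
	spin_lock(&data->throttle_lock);
	data->cancel_throttle = true;
	spin_unlock(&data->throttle_lock);

	/* ... then the teardown order no longer matters. */
	cancel_delayed_work_sync(&data->lmh_dcvs_poll_work);
	free_irq(data->lmh_dcvs_irq, data);
}

Once cancel_throttle is set, a late work run or a late irq bails out
before re-arming the other side, so the enable_irq()-after-free_irq()
race above goes away.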

But does this problem not exist in target_index(), fast_switch(), etc. 
also? One cpu can be disabling while the other one is updating the 
target, right?

> 

-- 
Warm Regards
Thara (She/Her/Hers)
