linux-kernel - Re: [Patch v3 3/6] cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support


Message-ID: <20210712044112.svhlagrktcfvyj35@vireshk-i7>
Date:   Mon, 12 Jul 2021 10:11:12 +0530
From:   Viresh Kumar <viresh.kumar@...aro.org>
To:     Thara Gopinath <thara.gopinath@...aro.org>
Cc:     agross@...nel.org, bjorn.andersson@...aro.org, rui.zhang@...el.com,
        daniel.lezcano@...aro.org, rjw@...ysocki.net, robh+dt@...nel.org,
        tdas@...eaurora.org, mka@...omium.org,
        linux-arm-msm@...r.kernel.org, linux-pm@...r.kernel.org,
        linux-kernel@...r.kernel.org, devicetree@...r.kernel.org
Subject: Re: [Patch v3 3/6] cpufreq: qcom-cpufreq-hw: Add dcvs interrupt
 support

On 09-07-21, 11:37, Thara Gopinath wrote:
> On 7/9/21 2:46 AM, Viresh Kumar wrote:
> > > @@ -389,6 +503,10 @@ static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
> > >   	dev_pm_opp_remove_all_dynamic(cpu_dev);
> > >   	dev_pm_opp_of_cpumask_remove_table(policy->related_cpus);
> > > +	if (data->lmh_dcvs_irq > 0) {
> > > +		devm_free_irq(cpu_dev, data->lmh_dcvs_irq, data);
> > 
> > Why using devm variants here and while requesting the irq ?

Missed this one ?

> > 
> > > +		cancel_delayed_work_sync(&data->lmh_dcvs_poll_work);
> > > +	}
> > 
> > Please move this to qcom_cpufreq_hw_lmh_exit() or something.
> 
> Ok.
> 
> > 
> > Now with sequence of disabling interrupt, etc, I see a potential
> > problem.
> > 
> > CPU0                                    CPU1
> > 
> > qcom_cpufreq_hw_cpu_exit()
> > -> devm_free_irq();
> >                                          qcom_lmh_dcvs_poll()
> >                                          -> qcom_lmh_dcvs_notify()
> >                                            -> enable_irq()
> > 
> > -> cancel_delayed_work_sync();
> > 
> > 
> > What will happen if enable_irq() gets called after freeing the irq ?
> > Not sure, but it looks like you will hit this then from manage.c:
> > 
> > WARN(!desc->irq_data.chip, KERN_ERR "enable_irq before
> >                                       setup/request_irq: irq %u\n", irq))
> > 
> > ?
> > 
> > You got a chicken n egg problem :)
> 
> Yes indeed! But also it is a very rare chicken and egg problem.
> The scenario here is that the cpus are busy and running load causing a
> thermal overrun and lmh is engaged. At the same time for this issue to be
> hit the cpu is trying to exit/disable cpufreq.

Yes, it is a very specific case but it needs to be resolved anyway. You don't
want to get this ever :)

> Calling
> cancel_delayed_work_sync first could solve this issue, right ?
> cancel_delayed_work_sync guarantees the work not to be pending even if
> it requeues itself on return. So once the delayed work is cancelled, the
> interrupts can be safely disabled. Thoughts ?

I don't think even that would provide such guarantees to you here, as there is
a chance the work gets queued again because of an interrupt that triggers right
after you cancel the work.

The basic way of solving such issues is that once you cancel something, you need
to guarantee that it doesn't get triggered again, no matter what.

The problem I see here is with your design itself: both the delayed work and the
irq can enable each other, so no matter which one you disable first, it won't be
sufficient. You need to fix that design somehow.
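
One possible way to break that cycle is sketched below as a small userspace
model, not as driver code. The idea is to set a "cancel" flag before tearing
anything down and to have both the irq handler and the poll work check it
before re-arming the other side. Every name here (struct lmh_model, lmh_irq(),
lmh_poll(), lmh_exit(), the cancel field) is hypothetical and chosen only for
illustration. The model is single-threaded; in a real driver the flag and the
re-arm decisions would have to be protected by a lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names come from the patch. */
struct lmh_model {
	bool irq_requested;  /* irq has been requested and not yet freed */
	bool irq_enabled;    /* irq is currently unmasked */
	bool work_pending;   /* delayed poll work is queued */
	bool cancel;         /* set once teardown has started */
	int bad_enable_irq;  /* enable attempts on a freed irq (the WARN case) */
};

static void lmh_init(struct lmh_model *m)
{
	m->irq_requested = true;
	m->irq_enabled = true;
	m->work_pending = false;
	m->cancel = false;
	m->bad_enable_irq = 0;
}

/* Models the interrupt handler: mask the irq and hand over to polling. */
static void lmh_irq(struct lmh_model *m)
{
	if (!m->irq_requested || !m->irq_enabled)
		return;
	m->irq_enabled = false;
	if (!m->cancel)
		m->work_pending = true;  /* would queue the delayed work */
}

/* Models one run of the poll work. */
static void lmh_poll(struct lmh_model *m, bool still_throttled)
{
	m->work_pending = false;
	if (m->cancel)
		return;                  /* teardown started: do not re-arm */
	if (still_throttled) {
		m->work_pending = true;  /* would requeue itself */
		return;
	}
	if (!m->irq_requested)
		m->bad_enable_irq++;     /* enable_irq() after free: the WARN */
	m->irq_enabled = true;
}

/* Models the exit path: gate both re-arm paths first, then tear down. */
static void lmh_exit(struct lmh_model *m)
{
	m->cancel = true;            /* 1. nothing may re-arm from here on */
	m->work_pending = false;     /* 2. cancel the delayed work */
	m->irq_requested = false;    /* 3. free the irq */
	m->irq_enabled = false;
	assert(!m->work_pending);
}
```

With the flag set first, a late run of the poll work or a late interrupt
returns without touching the other side, so the order of the remaining
teardown steps no longer matters in this model.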

-- 
viresh