[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210617103415.GA29877@arm.com>
Date: Thu, 17 Jun 2021 11:34:15 +0100
From: Ionela Voinescu <ionela.voinescu@....com>
To: Viresh Kumar <viresh.kumar@...aro.org>
Cc: Rafael Wysocki <rjw@...ysocki.net>,
Sudeep Holla <sudeep.holla@....com>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
linux-pm@...r.kernel.org, Qian Cai <quic_qiancai@...cinc.com>,
linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH V2 3/3] cpufreq: CPPC: Add support for frequency
invariance
Many thanks for the details!
On Thursday 17 Jun 2021 at 08:54:16 (+0530), Viresh Kumar wrote:
> On 16-06-21, 13:48, Ionela Voinescu wrote:
> > I was looking forward to the complete removal of stop_cpu() :).
>
> No one wants to remove more code than I do :)
>
> > I'll only comment on this for now as I should know the rest.
> >
> > Let's assume we don't have these, what happens now is the following:
> >
> > 1. We hotplug out the last CPU in a policy, we call the
> > .stop_cpu()/exit() function which will free the cppc_cpudata structure.
> >
> > The only vulnerability is if we have a last tick on that last CPU,
> > after the above callback was called.
> >
> > 2. When the CPU at 1. gets hotplugged back in, the cppc_fi->cpu_data is
> > stale.
> >
> > We do not have a problem when removing the CPPC cpufreq module as we're
> > doing cppc_freq_invariance_exit() before unregistering the driver and
> > freeing the data.
> >
> > Are 1. and 2 the only problems we have, or have I missed any?
>
> There is more to it. For simplicity, lets assume a quad-core setup,
> with all 4 CPUs sharing the cpufreq policy. And here is what happens
> without the new changes:
>
> - On CPPC cpufreq driver insertion, we start 4 kthreads/irq-works (1
> for each CPU as it fires from tick) from the ->init() callback.
>
> - Now lets say we offline CPU3. The CPPC cpufreq driver isn't notified
> by anyone and it hasn't registered itself to hotplug notifier as
> well. So, the irq-work/kthread isn't stopped. This results in the
> issue reported by Qian earlier.
>
I might be missing something, but when you offline a single CPU in a
policy, the worse that can happen is that a last call to
cppc_scale_freq_tick() would have sneaked in before irqs and the tick
are disabled. But even if we have a last call to
cppc_scale_freq_workfn(), the counter read methods would know how to
cope with hotplug, and the cppc_cpudata structure would still be
allocated and have valid desired_perf and highest_perf values.
Worse case, the last scale factor set for the CPU will be meaningless,
but it's already meaningless as the CPU is going down.
When you are referring to the issue reported by Qian I suppose you are
referring to this [1]. I think this is the case where you hotplug the
last CPU in a policy and free cppc_cpudata.
[1] https://lore.kernel.org/linux-pm/41f5195e-0e5f-fdfe-ba37-34e1fd8e4064@quicinc.com/
> The very same thing happens with schedutil governor too, which uses
> very similar mechanisms, and the cpufreq core takes care of it there
> by stopping the governor before removing the CPU from policy->cpus
> and starting it again. So there we stop irq-work/kthread for all 4
> CPUs, then start them only for remaining 3.
>
Things are different for sugov: you have to stop the governor when one
CPU in the policy goes down, and it's normal for sugov not to allow its
hooks to be called while the governor is down so it has to do a full
cleanup when going down and a full bringup when going back up.
The difference for CPPC invariance is that only a CPU can trigger the
work to update its own scale factor, through the tick. No other CPU x
can trigger a scale factor update for CPU y, but x can carry out the
work for CPU y (x can run the cppc_scale_freq_workfn(y)).
So when y goes down, it won't have a tick to queue any irq or kthread
work any longer until it's brought back up. So I believe that the only
cleanup that needs to be done when a CPU goes offline, is to ensure
that the work triggered by that last tick is done safely.
> I thought about that approach as well, but it was too heavy to stop
> everyone and start again in this case. And so I invented start_cpu()
> and stop_cpu() callbacks.
>
> - In this case, because the CPU is going away, we need to make sure we
> don't queue any more irq-work or kthread to it and this is one of
> the main reasons for adding synchronization in the topology core,
> because we need a hard guarantee here that irq-work won't fire
> again, as the CPU won't be there or will not be in a sane state.
>
We can't queue any more work for it as there's no tick for the offline
CPU.
> - The same sequence of events is true for the case where the last CPU
> of a policy goes away (not in this example, but lets say quad-core
> setup with separate policies for each CPU).
>
> - Not just the policy, but the CPU may be going away as well.
>
> I hope I was able to clarify a bit here.
>
Thanks! I do agree it is better to be cautious, but I initially wanted to
understand we don't see the problem bigger than it actually is.
Thanks,
Ionela.
P.S. I'll give more thought to the rcu use in the arch_topology driver.
I'm the boring kind that likes to err on the side of caution, so I tend
to agree that it might be good to have even if the current problem could
be solved in this driver.
> --
> viresh
Powered by blists - more mailing lists