lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 29 Jun 2021 09:47:37 +0100
From:   Ionela Voinescu <ionela.voinescu@....com>
To:     Viresh Kumar <viresh.kumar@...aro.org>
Cc:     Rafael Wysocki <rjw@...ysocki.net>,
        Sudeep Holla <sudeep.holla@....com>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        linux-pm@...r.kernel.org, Qian Cai <quic_qiancai@...cinc.com>,
        linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH V3 4/4] cpufreq: CPPC: Add support for frequency
 invariance

Hey,

On Tuesday 29 Jun 2021 at 10:02:44 (+0530), Viresh Kumar wrote:
> On 28-06-21, 11:49, Ionela Voinescu wrote:
> > To be honest I would like to have more time on this before you merge the
> > set, to better understand Qian's results and some observations I have
> > for Thunder X2 (I will share in a bit).
> 
> Ideally, this code was already merged in 5.13 and would have required
> us to fix any problems as we encounter them. I did revert it because
> it caused a kernel crash and I wasn't sure if there was a sane/easy
> way of fixing that so late in the release cycle. That was the right
> thing to do then.
> 
> All those issues are gone now, we may have an issue around rounding of
> counters or some hardware specific issues, it isn't clear yet.
> 
> But the stuff works fine otherwise, doesn't make the kernel crash and
> it is controlled with a CONFIG_ option, so those who don't want to use
> it can still disable it.
> 
> The merge window is here now, if we don't merge it now, it gets
> delayed by a full cycle (roughly two months) and if we merge it now
> and are able to narrow down the rounding issues, if there are any, we
> will have full two months to make a fix for that and still push it in
> 5.14 itself.
> 
> And so I would like to get it merged in this merge window itself, it
> also makes sure more people would get to test it, like Qian was able
> to figure out a problem here for us.
> 

Okay, makes sense. I have not seen this code actually do anything wrong
so far, and the issues I see on ThunderX2 point more to misbehaving
counters for this purpose. This being said, I would have probably
preferred for this feature to be disabled by default, until we've tested
more, but that won't give the chance to anyone else to test.

> > For the code, I think it's fine. I have a single observation regarding
> > the following code:
> > 
> > > +static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
> > > +{
> > > +	struct cppc_freq_invariance *cppc_fi;
> > > +	int cpu, ret;
> > > +
> > > +	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> > > +		return;
> > > +
> > > +	for_each_cpu(cpu, policy->cpus) {
> > > +		cppc_fi = &per_cpu(cppc_freq_inv, cpu);
> > > +		cppc_fi->cpu = cpu;
> > > +		cppc_fi->cpu_data = policy->driver_data;
> > > +		kthread_init_work(&cppc_fi->work, cppc_scale_freq_workfn);
> > > +		init_irq_work(&cppc_fi->irq_work, cppc_irq_work);
> > > +
> > > +		ret = cppc_get_perf_ctrs(cpu, &cppc_fi->prev_perf_fb_ctrs);
> > > +		if (ret) {
> > > +			pr_warn("%s: failed to read perf counters for cpu:%d: %d\n",
> > > +				__func__, cpu, ret);
> > > +			return;
> > > +		}
> > 
> > For this condition above, think about a scenario where reading counters
> > for offline CPUs returns an error. I'm not sure if that can happen, to
> > be honest. That would mean here that you will never initialise the freq
> > source unless all CPUs in the policy are online at policy creation.
> > 
> > My recommendation is to warn about the failed read of perf counters but
> > only return from this function if the target CPU is online as well when
> > reading counters fails.
> > 
> > This is probably a nit, so I'll let you decide if you want to do something
> > about this.
> 
> That is a very good observation actually. Thanks for that. This is how
> I fixed it.
> 
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index d688877e8fbe..f6540068d0fe 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -171,7 +171,13 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
>                 if (ret) {
>                         pr_warn("%s: failed to read perf counters for cpu:%d: %d\n",
>                                 __func__, cpu, ret);
> -                       return;
> +
> +                       /*
> +                        * Don't abort if the CPU was offline while the driver
> +                        * was getting registered.
> +                        */
> +                       if (cpu_online(cpu))
> +                               return;
>                 }
>         }
> 
> -- 

Thanks!

Reviewed-by: Ionela Voinescu <ionela.voinescu@....com>

Ionela.

> viresh

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ