linux-kernel - Re: [PATCH v6 7/7][Resend] cpufreq: schedutil: New governor based on scheduler utilization data

Message-ID: <1569749.Ubc4LvLySG@vostro.rjw.lan>
Date:	Tue, 29 Mar 2016 14:23:21 +0200
From:	"Rafael J. Wysocki" <rjw@...ysocki.net>
To:	Steve Muckle <steve.muckle@...aro.org>
Cc:	"Rafael J. Wysocki" <rafael@...nel.org>,
	Linux PM list <linux-pm@...r.kernel.org>,
	Juri Lelli <juri.lelli@....com>,
	ACPI Devel Maling List <linux-acpi@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Michael Turquette <mturquette@...libre.com>,
	Ingo Molnar <mingo@...nel.org>
Subject: Re: [PATCH v6 7/7][Resend] cpufreq: schedutil: New governor based on scheduler utilization data

On Monday, March 28, 2016 11:17:44 AM Steve Muckle wrote:
> On 03/26/2016 06:36 PM, Rafael J. Wysocki wrote:
> >>>> >>> +static int sugov_limits(struct cpufreq_policy *policy)
> >>>> >>> +{
> >>>> >>> +     struct sugov_policy *sg_policy = policy->governor_data;
> >>>> >>> +
> >>>> >>> +     if (!policy->fast_switch_enabled) {
> >>>> >>> +             mutex_lock(&sg_policy->work_lock);
> >>>> >>> +
> >>>> >>> +             if (policy->max < policy->cur)
> >>>> >>> +                     __cpufreq_driver_target(policy, policy->max,
> >>>> >>> +                                             CPUFREQ_RELATION_H);
> >>>> >>> +             else if (policy->min > policy->cur)
> >>>> >>> +                     __cpufreq_driver_target(policy, policy->min,
> >>>> >>> +                                             CPUFREQ_RELATION_L);
> >>>> >>> +
> >>>> >>> +             mutex_unlock(&sg_policy->work_lock);
> >>>> >>> +     }
> >>> >>
> >>> >> Is the expectation that in the fast_switch_enabled case we should
> >>> >> re-evaluate soon enough that an explicit fixup is not required here?
> >> >
> >> > Yes, it is.
> >> >
> >>> >> I'm worried as to whether that will always be true given the possible
> >>> >> criticality of applying frequency limits (thermal for example).
> >> >
> >> > The part of the patch below that you cut actually takes care of that:
> >> >
> >> >     sg_policy->need_freq_update = true;
> >> >
> >> > which causes the rate limit to be ignored essentially, so the
> >> > frequency will be changed on the first update from the scheduler.
> 
> The scenario I'm contemplating is that while a CPU-intensive task is
> running, a thermal interrupt goes off. The driver for this thermal
> interrupt responds by capping fmax. If this happens just after the tick,
> it seems possible that we could wait a full tick before changing the
> frequency. Given a 10ms tick it could be rather annoying for thermal
> management algorithms on some platforms (I'm familiar with a few).

The thermal driver then has to do something like cpufreq_update_policy(),
which can only happen in process context.  I'm not sure how it is possible
to guarantee any latency better than that full tick here anyway.
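
For illustration, the need_freq_update behaviour quoted above can be modeled
in user space roughly as follows.  This is only a sketch under assumptions:
struct sg_model, rate_limit_ns, model_limits_changed() and
model_should_update() are hypothetical names made up for the example, not
code from the patch.  It shows that a limits change makes the next update
from the scheduler go through even if the rate limit has not expired yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model, not the kernel's struct sugov_policy. */
struct sg_model {
	uint64_t last_freq_update_time;	/* time of the last change, in ns */
	uint64_t rate_limit_ns;		/* assumed name: minimum spacing between changes */
	bool need_freq_update;		/* set when the policy limits change */
};

/* Models the fast-switch case of the limits callback: it only sets a flag. */
static void model_limits_changed(struct sg_model *sg)
{
	assert(sg != NULL);
	sg->need_freq_update = true;
}

/* Decide whether a scheduler update at time 'now' may change the frequency. */
static bool model_should_update(struct sg_model *sg, uint64_t now)
{
	assert(sg != NULL);

	/* A pending limits change overrides the rate limit exactly once. */
	if (sg->need_freq_update) {
		sg->need_freq_update = false;
		return true;
	}

	return now - sg->last_freq_update_time >= sg->rate_limit_ns;
}
```

In this model the worst-case delay after a limits change is the time until
the next scheduler update arrives, not the rate limit.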

> >> > Which also is why the min/max check is before the sg_policy->next_freq
> >> > == next_freq check in sugov_update_commit().
> >> >
> >> > I wanted to avoid locking in the fast switch/one CPU per policy case
> >> > which otherwise would be necessary just for the handling of this
> >> > thing.  I'd like to keep it the way it is unless it can be clearly
> >> > demonstrated that it really would lead to problems in practice in a
> >> > real system.
> >
> > Besides, even if frequency is updated directly from here in the "fast
> > switch" case, that still doesn't guarantee that it will be updated
> > immediately, because the task running this code may be preempted and
> > only scheduled again in the next cycle.
> >
> > Not to mention the fact that it may not run on the CPU to be updated,
> > so it would need to use something like smp_call_function_single() for
> > the update and that would complicate things even more.
> > 
> > Overall, I don't really think that doing the update directly from here
> > in the "fast switch" case would improve things much latency-wise and
> > it would increase complexity and introduce overhead into the fast
> > path.  So this really is a tradeoff and the current choice is the
> > right one IMO.
> 
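
The ordering point made above (apply the min/max limits before comparing
against the cached next_freq) can be illustrated with a small user-space
model.  It is a sketch only: struct commit_model, clamp_freq() and
model_commit() are hypothetical names, and the real sugov_update_commit()
does more than this.  If the comparison came first, a repeated request for
the same frequency would be skipped even after the maximum had been lowered
below it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state relevant to the commit step. */
struct commit_model {
	unsigned int min;	/* current policy minimum */
	unsigned int max;	/* current policy maximum */
	unsigned int next_freq;	/* last frequency that was committed */
};

/* Clamp a requested frequency to the [min, max] range. */
static unsigned int clamp_freq(unsigned int freq, unsigned int min,
			       unsigned int max)
{
	if (freq > max)
		return max;
	if (freq < min)
		return min;
	return freq;
}

/*
 * Returns true if a frequency change would be issued.  The limits are
 * applied first, so a changed limit alters the value being compared.
 */
static bool model_commit(struct commit_model *m, unsigned int requested)
{
	unsigned int freq;

	assert(m != NULL);
	freq = clamp_freq(requested, m->min, m->max);

	if (m->next_freq == freq)
		return false;	/* nothing to do, same as last time */

	m->next_freq = freq;
	return true;
}
```

With min = 1000, max = 2000 and a cached next_freq of 2000, a request for
2000 is a no-op.  After max drops to 1500, the same request commits 1500.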
> On the desire to avoid locking in the fast switch/one CPU per policy
> case, I wondered about whether disabling interrupts in sugov_limits()
> would suffice. That's a rarely called function and I was hoping that the
> update hook would already have interrupts disabled due to its being
> called in scheduler paths that may do raw_spin_lock_irqsave. But I'm not
> sure offhand that will always be true.

It will.

That's why we can use RCU-sched in cpufreq_update_util() etc.

> If it isn't, though, then I'm not
> sure what's necessarily stopping, say, the sched tick from calling the hook
> while the hook is already in progress from some other path.
> 
> Agreed there would need to be some additional complexity somewhere to
> get things running on the correct CPU.
> 
> Anyway I have nothing against deferring this for now.

OK

Thanks,
Rafael