linux-kernel - Re: [PATCH 1/2] sched/fair: move cpufreq hook to update_cfs_rq_load_avg()

Message-ID: <CAJZ5v0gRaP9s_B6ttEpBSqBcGRX-j+kNetyaKo07C2XpY81EEQ@mail.gmail.com>
Date:	Wed, 13 Apr 2016 16:45:56 +0200
From:	"Rafael J. Wysocki" <rafael@...nel.org>
To:	Steve Muckle <steve.muckle@...aro.org>
Cc:	"Rafael J. Wysocki" <rafael@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Ingo Molnar <mingo@...hat.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Morten Rasmussen <morten.rasmussen@....com>,
	Juri Lelli <Juri.Lelli@....com>,
	Patrick Bellasi <patrick.bellasi@....com>,
	Michael Turquette <mturquette@...libre.com>
Subject: Re: [PATCH 1/2] sched/fair: move cpufreq hook to update_cfs_rq_load_avg()

On Tue, Apr 12, 2016 at 9:38 PM, Steve Muckle <steve.muckle@...aro.org> wrote:
> On Tue, Apr 12, 2016 at 04:29:06PM +0200, Rafael J. Wysocki wrote:
>> On Mon, Apr 11, 2016 at 11:20 PM, Rafael J. Wysocki <rafael@...nel.org> wrote:
>> > On Mon, Apr 11, 2016 at 9:28 PM, Steve Muckle <steve.muckle@...aro.org> wrote:
>> >> Hi Rafael,
>> >>
>> >> On 04/01/2016 02:20 AM, Peter Zijlstra wrote:
>> >>>> > My thinking was in CFS we get rid of the (cpu == smp_processor_id())
>> >>>> > condition for calling the cpufreq hook.
>> >>>> >
>> >>>> > The sched governor can then calculate utilization and frequency required
>> >>>> > for cpu. If (cpu == smp_processor_id()), the update is processed
>> >>>> > normally. If (cpu != smp_processor_id()) and the new frequency is higher
>> >>>> > than cpu's Fcur, the sched gov IPIs cpu to continue running the update
>> >>>> > operation. Otherwise, the update is dropped.
>> >>>> >
>> >>>> > Does that sound plausible?
>> >>>
>> >>> Can be done I suppose..
>> >>
>> >> Currently we drop schedutil updates for a target CPU which do not occur
>> >> on that CPU.
>> >>
>> >> Is this solely due to platforms which must run the cpufreq driver on the
>> >> target CPU?
>> >
>> > The current code assumes that the CPU running the update will always
>> > be the one that gets updated.  Anything else would require extra
>> > synchronization.
>>
>> This is rather fundamental.
>>
>> For example, if you look at cpufreq_update_util(), it does this:
>>
>> data = rcu_dereference_sched(*this_cpu_ptr(&cpufreq_update_util_data));
>>
>> meaning that it will run the current CPU's utilization update
>> callback.  Of course, that won't work cross-CPU, because in principle
>> different CPUs may use different governors and therefore different
>> util update callbacks.
>>
>> If you want to do remote updates, I guess that will require an
>> irq_work to run the update on the target CPU, but then you'll probably
>> want to neglect the rate limit on it as well, so it looks like a
>> "need_update" flag in struct update_util_data will be useful for that.
>>
>> I think I can prototype something along these lines, but can you
>> please tell me more about the case you have in mind?
>
> I'm concerned generally with the latency to react to changes in
> required capacity due to remote wakeups, which are quite common on SMP
> platforms with shared cache. Unless the hook is called it could take
> up to a tick to react AFAICS if the target CPU is running some other
> task that does not get preempted by the wakeup.

So the scenario seems to be that CPU A is running task X and CPU B
wakes up task Y on it remotely, but task Y has to wait for CPU A to
get to it, so you want to increase the frequency of CPU A at wakeup
time to reduce the time the woken-up task has to wait.

In that case task X would not be giving the CPU away (i.e. no
invocations of schedule()) for the whole tick, so it would be
CPU/memory bound.  I would then expect CPU A to be running at
full capacity already, unless this is the first tick period in which
task X behaves this way, which looks like a corner case to me.

Moreover, sending an IPI to CPU A in that case looks like the right
thing to do to me anyway.

Thanks,
Rafael
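
For illustration only, the policy discussed in this thread can be
sketched in plain userspace C. This is not kernel code and not the
schedutil implementation. The names classify_update(),
should_apply_update(), struct cpu_state and the enum values are all
hypothetical. The first function models the proposal quoted above:
process local updates normally, forward a remote update to the target
CPU only if it would raise the frequency, and drop it otherwise. The
second models the suggested "need_update" flag, assuming it simply
bypasses the rate limit once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of a utilization update request. */
enum update_action {
	UPDATE_LOCAL,	/* handle the update on the current CPU */
	UPDATE_REMOTE,	/* ask the target CPU to run it (IPI / irq_work) */
	UPDATE_DROP,	/* ignore the update */
};

/* Hypothetical per-CPU state; field names are assumptions. */
struct cpu_state {
	unsigned int cur_freq_khz;	/* current frequency (Fcur) */
	uint64_t last_update_ns;	/* time of the last applied update */
	bool need_update;		/* set by a remote request */
};

/*
 * Decide what to do with an update for target_cpu that was computed
 * on this_cpu. Remote updates are forwarded only when they would
 * raise the target's frequency.
 */
enum update_action classify_update(int this_cpu, int target_cpu,
				   const struct cpu_state *target,
				   unsigned int next_freq_khz)
{
	assert(target != NULL);

	if (this_cpu == target_cpu)
		return UPDATE_LOCAL;
	if (next_freq_khz > target->cur_freq_khz)
		return UPDATE_REMOTE;
	return UPDATE_DROP;
}

/*
 * Rate-limit check run on the target CPU. A pending need_update
 * request bypasses the rate limit once and is then cleared.
 */
bool should_apply_update(struct cpu_state *st, uint64_t now_ns,
			 uint64_t rate_limit_ns)
{
	assert(st != NULL);

	if (st->need_update) {
		st->need_update = false;
		st->last_update_ns = now_ns;
		return true;
	}
	if (now_ns - st->last_update_ns < rate_limit_ns)
		return false;
	st->last_update_ns = now_ns;
	return true;
}
```

In this sketch a caller that gets UPDATE_REMOTE would set need_update
on the target's state and then trigger the target CPU, so that the
update callback runs there with the target's own governor data.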