linux-kernel - Re: [PATCH 03/11] sched: Extend scheduler's asym packing


Message-ID: <20160826103945.GC1323@e105550-lin.cambridge.arm.com>
Date:   Fri, 26 Aug 2016 11:39:46 +0100
From:   Morten Rasmussen <morten.rasmussen@....com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        mingo@...hat.com, tglx@...utronix.de, hpa@...or.com,
        rjw@...ysocki.net, x86@...nel.org, bp@...e.de,
        sudeep.holla@....com, ak@...ux.intel.com,
        linux-acpi@...r.kernel.org, linux-pm@...r.kernel.org,
        alexey.klimov@....com, viresh.kumar@...aro.org,
        akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
        lenb@...nel.org, tim.c.chen@...ux.intel.com,
        paul.gortmaker@...driver.com, jpoimboe@...hat.com,
        mcgrof@...nel.org, jgross@...e.com, robert.moore@...el.com,
        dvyukov@...gle.com, jeyu@...hat.com
Subject: Re: [PATCH 03/11] sched: Extend scheduler's asym packing

On Thu, Aug 25, 2016 at 03:45:03PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 25, 2016 at 02:18:37PM +0100, Morten Rasmussen wrote:
> 
> > But why not just pass the customized list into the scheduler? Seems
> > simpler?
> 
> Mostly because I didn't want to regress Power I suppose. The ITMT stuff
> needs an extra load, whereas the Power stuff can use the CPU number we
> already have.

The customized list wouldn't have to be mandatory. You could easily
create a default list that would match current behaviour for Power.

To pass in a custom list of priorities you could either extend struct
sched_domain_topology_level to have another function pointer that
returns the cpu priority, or introduce an arch_cpu_priority() function.
Either of them could be used in the sched_domain hierarchy to set the
sched_group priority cpu and if you add a rq->cpu_priority, the
asymmetric packing comparison would be a simple comparison between
rq->cpu_priority of the two cpus in question.
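
A rough sketch of that idea, written as standalone C rather than real
kernel code. Everything here is an assumption made for illustration: the
tiny NR_CPUS, the stripped-down struct rq, the arch hook's signature, and
the helper names init_cpu_priorities() and asym_prefer() are hypothetical
and not taken from the patch set. The default hook returns -cpu so that
lower-numbered cpus win, which is meant to mimic the existing packing
behaviour on Power.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical, kept small for illustration */

/* Stand-in for the per-cpu runqueue; only the proposed field is shown. */
struct rq {
	int cpu_priority;
};

static struct rq runqueues[NR_CPUS];

/*
 * Hypothetical default hook: a lower cpu number gives a higher priority,
 * so packing still favours the lowest-numbered cpu.
 */
static int arch_cpu_priority(int cpu)
{
	return -cpu;
}

/* Cache the priorities once, e.g. while the domain hierarchy is built. */
static void init_cpu_priorities(int (*prio_fn)(int cpu))
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		runqueues[cpu].cpu_priority = prio_fn(cpu);
}

/* The asymmetric packing check reduces to comparing two cached values. */
static bool asym_prefer(int a, int b)
{
	assert(a >= 0 && a < NR_CPUS && b >= 0 && b < NR_CPUS);
	return runqueues[a].cpu_priority > runqueues[b].cpu_priority;
}
```

An architecture with its own ordering would pass a different function to
init_cpu_priorities(); the comparison itself would not change.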

What is the 'extra load' needed for ITMT? Isn't it just a priority list,
or does the absolute priority value have a meaning? I only saw it used
for less_than comparison, maybe I missed it.

If you need to express the difference in compute capability, why not use
capacity?

> Also, since we need an interface to pass in this custom list, I don't
> see the distinction, you can do the same manipulation by constantly
> updating the prio list.

Sure, but the overhead of rebuilding the sched_domain hierarchy is huge
compared to just tweaking the result of the less_than operator that gets
called from the scheduler frequently. However, updating
group_priority_cpu() would require a rebuild too in this patch set.
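
To illustrate the difference, here is a toy model in standalone C, not
kernel code. The cpu_prio[] array, struct group and group_rebuild() are
hypothetical stand-ins; only the names less_than and the notion of a
group priority cpu come from the discussion. The point it tries to show
is that a changed priority is visible to the comparison immediately,
while a value cached per group stays stale until something recomputes it.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical, kept small for illustration */

/* Hypothetical priority table that arch code (or others) may update. */
static int cpu_prio[NR_CPUS];

/* Toy group: a contiguous cpu range plus its cached priority cpu. */
struct group {
	int first_cpu;
	int nr_cpus;
	int prio_cpu;	/* cached; only refreshed by group_rebuild() */
};

/* Evaluated on every call, so it always sees the current priorities. */
static bool less_than(int a, int b)
{
	return cpu_prio[a] < cpu_prio[b];
}

/* Stand-in for the expensive rebuild that refreshes the cached value. */
static void group_rebuild(struct group *g)
{
	int best = g->first_cpu;

	for (int cpu = g->first_cpu + 1; cpu < g->first_cpu + g->nr_cpus; cpu++)
		if (less_than(best, cpu))
			best = cpu;
	g->prio_cpu = best;
}
```

In this model, raising cpu_prio[2] changes what less_than(0, 2) returns
right away, but g.prio_cpu keeps its old value until group_rebuild() runs
again.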

> But none of this stuff should be EXPORT'ed, so it's only available to the
> core kernel, which greatly limits the potential for abuse. We can see
> arch code just fine.

I don't see why it can't be wired up to be controlled by entities
outside arch code, e.g. cpufreq or the thermal framework, or even code
outside the kernel (firmware).

> And if you spin a custom kernel, you can already wreck the load
> balancer.

You can wreck any software where you have the source code and a compiler
:)