linux-kernel - Re: [PATCH] sched/fair: Do not decay new task load on first enqueue

Message-ID: <f2091da3-b96e-d26c-8db7-a1db2d9237ae@arm.com>
Date:   Mon, 10 Oct 2016 14:54:41 +0100
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Vincent Guittot <vincent.guittot@...aro.org>,
        Matt Fleming <matt@...eblueprint.co.uk>
Cc:     Wanpeng Li <kernellwp@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Mike Galbraith <umgwanakikbuti@...il.com>,
        Yuyang Du <yuyang.du@...el.com>
Subject: Re: [PATCH] sched/fair: Do not decay new task load on first enqueue

On 10/10/16 13:29, Vincent Guittot wrote:
> On 10 October 2016 at 12:01, Matt Fleming <matt@...eblueprint.co.uk> wrote:
>> On Sun, 09 Oct, at 11:39:27AM, Wanpeng Li wrote:
>>>
>>> The difference between this patch and Peterz's is that your patch has
>>> a delta, since activate_task()->enqueue_task() does do
>>> update_rq_clock(), so why would not having the delta cause low cpu
>>> machines (4 or 8) to regress, per your other reply in this thread?
>>
>> Both my patch and Peter's patch cause issues with low cpu machines. In
>> <20161004201105.GP16071@...eblueprint.co.uk> I said,
>>
>>  "This patch causes some low cpu machines (4 or 8) to regress. It turns
>>   out they regress with my patch too."
>>
>> Have I misunderstood your question?
>>
>> I ran out of time to investigate this last week, though I did try all
>> proposed patches, including Vincent's, and none of them produced wins
>> across the board.
> 
> I have tried to reproduce your issue on my target, a hikey board (ARM
> based octo cores), but I failed to see a regression with commit
> 7dc603c9028e. Nevertheless, I can see tasks not being well spread

Wasn't this about the two patches mentioned in this thread? The one from
Matt using 'se->sum_exec_runtime' in the if condition in
enqueue_entity_load_avg() and Peterz's conditional call to
update_rq_clock(rq) in enqueue_task()?
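
For reference, a minimal sketch of why a clock delta at first enqueue matters for a new task's load. PELT decays load geometrically per 1024 us period with y^32 = 0.5, so any time delta applied to a freshly forked task shrinks its initial load before it has run. This is a simplified model, not the kernel code: it ignores partial periods and new contributions, and `decay_load` and the initial load of 1024 are hypothetical names and values chosen for illustration.

```python
# Simplified model of PELT geometric decay (NOT the kernel implementation).
PELT_PERIOD_US = 1024     # one PELT period is 1024 us
HALF_LIFE_PERIODS = 32    # load halves every 32 periods: y**32 == 0.5

Y = 0.5 ** (1.0 / HALF_LIFE_PERIODS)


def decay_load(load_avg, delta_us):
    """Decay load_avg over the whole periods contained in delta_us.

    Hypothetical helper: partial periods and accrual are ignored.
    """
    periods = delta_us // PELT_PERIOD_US
    return load_avg * (Y ** periods)


if __name__ == "__main__":
    initial_load = 1024.0  # assumed initial load of a newly forked task
    for periods in (0, 1, 8, 32, 64):
        delta_us = periods * PELT_PERIOD_US
        print(f"delta = {periods:3d} periods -> load = "
              f"{decay_load(initial_load, delta_us):8.2f}")
```

With a zero delta the load is left untouched, which is the behaviour both patches aim for on the first enqueue.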

> during fork as you mentioned. So I studied the spreading issue during
> fork a bit more last week and I have a new version of my proposed
> patch that I'm going to send soon. With this patch, I can see a good
> spread of tasks during the fork sequence and some kind of perf
> improvement, even if it's a bit difficult to tell as the variance is
> quite large with the hackbench test, so it's mainly an improvement in
> the repeatability of the results

Hikey (ARM64 2x4 cpus) board: cpufreq: performance, cpuidle: disabled

Performance counter stats for 'perf bench sched messaging -g 20 -l 500'
(10 runs):

(1) tip/sched/core: commit 447976ef4fd0

    5.902209533 seconds time elapsed ( +- 0.31% )

(2) tip/sched/core + original patch on the 'sched/fair: Do not decay
    new task load on first enqueue' thread (23/09/16)

    5.919933030 seconds time elapsed ( +- 0.44% )

(3) tip/sched/core + Peter's ENQUEUE_NEW patch on the 'sched/fair: Do
    not decay new task load on first enqueue' thread (28/09/16)

    5.970195534 seconds time elapsed ( +- 0.37% )

Not sure if we can call this a regression but it also shows no
performance gain.
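
To put the three numbers side by side, a small sketch that computes the relative change of (2) and (3) against the (1) baseline. The timings are the ones reported above; the dictionary labels and the helper name `relative_change_pct` are just chosen here for illustration.

```python
# Mean elapsed times (seconds) from the three 10-run measurements above.
BASELINE = 5.902209533  # (1) tip/sched/core

RESULTS = {
    "(2) original patch": 5.919933030,
    "(3) ENQUEUE_NEW patch": 5.970195534,
}


def relative_change_pct(value, base):
    """Return the change of value relative to base, in percent."""
    return (value - base) / base * 100.0


if __name__ == "__main__":
    for label, elapsed in RESULTS.items():
        change = relative_change_pct(elapsed, BASELINE)
        print(f"{label}: {elapsed:.3f} s ({change:+.2f}% vs. baseline)")
```

The resulting differences should be read against the reported run-to-run variation (+- 0.31% to 0.44%).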

>>
>> I should get a bit further this week.
>>
>> Vincent, Dietmar, did you guys ever get around to submitting your PELT
>> tracepoint patches? Getting some introspection into the scheduler's
> 
> My tracepoints are not in a shape to be submitted and would need a
> cleanup, as some are more hacks for debugging than real trace events.
> Nevertheless, I can push them to a git branch if they would be useful
> for someone

We carry two trace events locally, one for PELT on se and one for
cfs_rq's (I have to add the runnable bits here) which work for
CONFIG_FAIR_GROUP_SCHED and !CONFIG_FAIR_GROUP_SCHED. I put them into
__update_load_avg(), attach_entity_load_avg() and
detach_entity_load_avg(). I could post them but so far mainline has been
reluctant to see the need for PELT related trace events ...

[...]