Message-ID: <73d2be82-b4da-f87e-a1e3-5c187a268e69@efficios.com>
Date:   Fri, 25 Aug 2023 09:51:19 -0400
From:   Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:     Aaron Lu <aaron.lu@...el.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
        Valentin Schneider <vschneid@...hat.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Swapnil Sapkal <Swapnil.Sapkal@....com>,
        Julien Desfossez <jdesfossez@...italocean.com>, x86@...nel.org
Subject: Re: [RFC PATCH v3 2/3] sched: Introduce cpus_share_l2c

On 8/25/23 02:49, Aaron Lu wrote:
> On Thu, Aug 24, 2023 at 10:40:45AM -0400, Mathieu Desnoyers wrote:
[...]
>>> - task migrations dropped with this series for nr_group=20 and 32
>>>     according to 'perf stat'. migration number didn't drop for nr_group=10
>>>     but the two update functions' cost dropped which means fewer access to
>>>     tg->load_avg and thus, fewer task migrations. This is contradictory
>>>     and I can not explain yet;
>>
>> Neither can I.
>>

[...]

>>
>>> It's not clear to me why this series can reduce task migrations. I doubt
>>> it has something to do with more wakelist style wakeup because for this
>>> test machine, only a single core with two SMT threads share L2 so more
>>> wakeups are through wakelist. In wakelist style wakeup, the target rq's
>>> ttwu_pending is set and that will make the target cpu as !idle_cpu();
>>> This is faster than grabbing the target rq's lock and then increase
>>> target rq's nr_running or set target rq's curr to something else than
>>> idle. So wakelist style wakeup can make target cpu appear as non idle
>>> faster, but I can't connect this with reduced migration yet, I just feel
>>> this might be the reason why task migration reduced.
>>
> 
[...]
>> I've tried adding checks for rq->ttwu_pending in those code paths on top of
>> my patch and I'm still observing the reduction in number of migrations, so
>> it's unclear to me how doing more queued wakeups can reduce migrations the
>> way it does.
> 
> An interesting puzzle.

One metric that can help understand the impact of my patch: comparing
hackbench on a baseline where only your load_avg patch is applied
against a kernel that also has my l2c patch applied, I notice that the
goidle schedstat is cut roughly in half. For a given CPU (they are
pretty much alike), it goes from 650456 to 353487.
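
For reference, the size of that drop can be worked out from the two
counters above. The snippet below is only a sketch: the helper names
are made up for illustration, and the parser assumes the per-CPU line
layout described in the kernel's sched-stats documentation (schedstat
version 15), where the 4th counter after the "cpuN" label is the
number of times schedule() left the CPU idle. Other schedstat versions
may lay the fields out differently.

```python
def goidle_per_cpu(schedstat_text):
    """Return {cpu_label: goidle_count} from /proc/schedstat-style text.

    Assumption: on "cpuN ..." lines, the 4th counter is sched_goidle
    (layout of schedstat version 15). Other lines are ignored.
    """
    counts = {}
    for line in schedstat_text.splitlines():
        fields = line.split()
        # Skip "version", "timestamp" and "domainN" lines.
        if len(fields) < 5 or not fields[0].startswith("cpu"):
            continue
        counts[fields[0]] = int(fields[4])
    return counts


def relative_drop(before, after):
    """Fraction by which a counter decreased between two runs."""
    return (before - after) / before


# The two goidle values quoted in this message.
baseline, patched = 650456, 353487
print(f"goidle ratio:  {patched / baseline:.3f}")                    # 0.543
print(f"relative drop: {relative_drop(baseline, patched):.1%}")      # 45.7%
```

So the counter falls by a little under half (about 46%) for that CPU.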

So could it be that by doing queued wakeups, we end up batching
execution of the woken up tasks for a given CPU, rather than going
back and forth between idle and non-idle? One important consequence
would be a reduction in the number of newidle balances triggered.
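
To make the batching idea concrete, here is a toy model. It is not the
scheduler and makes no claim about how queued wakeups actually behave:
it assumes a single CPU, a fixed run time per woken task, and a
hypothetical "window" that coalesces wakeups to the end of the window
they arrive in. It only shows that coalescing the same wakeups yields
fewer busy-to-idle transitions.

```python
import math


def count_goidle(arrivals, run_time):
    """Count busy->idle transitions for one CPU in a toy model.

    Each arrival makes one task runnable; each task runs for run_time
    and then sleeps. The CPU goes idle whenever its queue drains.
    """
    goidle = 0
    busy_until = None
    for t in sorted(arrivals):
        if busy_until is None:
            busy_until = t + run_time      # first task starts running
        elif t > busy_until:
            goidle += 1                    # queue drained before t
            busy_until = t + run_time
        else:
            busy_until += run_time         # task queues behind others
    if busy_until is not None:
        goidle += 1                        # final drain after last task
    return goidle


def coalesce(arrivals, window):
    """Defer each wakeup to the end of its window (toy batching)."""
    return [math.ceil(t / window) * window for t in arrivals]


wakeups = [0, 3, 6, 9, 12, 15]
print(count_goidle(wakeups, run_time=1))                 # 6
print(count_goidle(coalesce(wakeups, 6), run_time=1))    # 4
```

Every avoided idle transition in this model is one fewer point at
which a newidle balance could have been attempted.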

Thoughts?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com