Message-ID: <20220513063729.GF76023@worktop.programming.kicks-ass.net>
Date: Fri, 13 May 2022 08:37:29 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Tianchen Ding <dtcccc@...ux.alibaba.com>
Cc: Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] sched: Queue task on wakelist in the same llc if the
wakee cpu is idle
On Fri, May 13, 2022 at 02:24:27PM +0800, Tianchen Ding wrote:
> We noticed that commit 518cd6234178 ("sched: Only queue remote wakeups
> when crossing cache boundaries") disabled queueing tasks on the wakelist
> when the CPUs share LLC. This is because, at that time, the scheduler
> had to send IPIs to do ttwu_queue_wakelist.
No; this was because of cache bouncing.
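For reference, the gate that commit added to ttwu_queue() was roughly the
below; this is a from-memory sketch of the idea, not the exact hunk:

	/*
	 * Grabbing a remote rq->lock from outside its cache domain
	 * bounces the lock cacheline around; in that case queue the
	 * task on the remote wake_list and let the wakee-side CPU do
	 * the enqueue itself. Within an LLC the lock is cheap enough
	 * to just take directly.
	 */
	if (sched_feat(TTWU_QUEUE) &&
	    !cpus_share_cache(smp_processor_id(), cpu)) {
		sched_clock_cpu(cpu); /* sync clocks x-cpu */
		ttwu_queue_remote(p, cpu);
		return;
	}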
> Nowadays, ttwu_queue_wakelist() also supports TIF_POLLING, so this is
> no longer a problem when the wakee CPU is idle polling.
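The polling case referred to here: when a wakelist entry is queued for an
idle CPU, the generic code first tries to set TIF_NEED_RESCHED on a
polling idle task and only falls back to a real IPI if that fails.
Roughly, simplified from recent kernels (details vary by version):

	void send_call_function_single_ipi(int cpu)
	{
		struct rq *rq = cpu_rq(cpu);

		/*
		 * An idle task spinning with TIF_POLLING_NRFLAG set will
		 * notice TIF_NEED_RESCHED on its own; no interrupt needed.
		 */
		if (!set_nr_if_polling(rq->idle))
			arch_send_call_function_single_ipi(cpu);
		else
			trace_sched_wake_idle_without_ipi(cpu);
	}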
>
> Benefits:
> Queueing the task on an idle CPU can help improve performance on the
> waker CPU and utilization on the wakee CPU, and further improves
> locality because the wakee CPU handles its own rq. This patch helps
> improve rt on our real Java workloads where wakeups happen frequently.
>
> Does this patch bring IPI flooding?
> For archs with TIF_POLLING_NRFLAG (e.g., x86), there is no difference
> if the wakee CPU is idle polling. If the wakee CPU is idle but not
> polling, the later check_preempt_curr() will send an IPI too.
>
> For archs without TIF_POLLING_NRFLAG (e.g., arm64), the IPI is
> unavoidable, since the later check_preempt_curr() will send an IPI
> when the wakee CPU is idle anyway.
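As a sketch of the proposed condition (the hunk itself isn't quoted
here, so idle_cpu() stands in for whatever predicate the patch actually
uses):

	static inline bool ttwu_queue_cond(int cpu, int wake_flags)
	{
		/* As before: always queue when crossing cache boundaries. */
		if (!cpus_share_cache(smp_processor_id(), cpu))
			return true;

		/* Proposed: also queue when the wakee CPU is idle. */
		if (idle_cpu(cpu))
			return true;

		return false;
	}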
>
> Benchmark:
> Running schbench -m 2 -t 8 on an Intel Xeon Platinum 8269CY:
>
> without patch:
> Latency percentiles (usec)
> 50.0000th: 10
> 75.0000th: 14
> 90.0000th: 16
> 95.0000th: 16
> *99.0000th: 17
> 99.5000th: 20
> 99.9000th: 23
> min=0, max=28
>
> with patch:
> Latency percentiles (usec)
> 50.0000th: 6
> 75.0000th: 8
> 90.0000th: 9
> 95.0000th: 9
> *99.0000th: 10
> 99.5000th: 10
> 99.9000th: 14
> min=0, max=16
>
> We've also tested unixbench and see about a 10% improvement on
> Pipe-based Context Switching, with no performance regression on the
> other test cases.
>
> For arm64, we've tested schbench and unixbench on Kunpeng 920; the
> results show that
What is a Kunpeng and how does its topology look?
> the improvement is not as obvious as on x86, and
> there's no performance regression.
x86 is wide and varied; what x86 did you test?