linux-kernel - Re: [RFC PATCH 1/1] sched/eevdf: Skip eligibility check for current entity during wakeup preemption

Message-ID: <79abd35f-1e15-3585-c5dd-1f2841896f4f@amd.com>
Date: Wed, 17 Apr 2024 11:38:20 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Youssef Esmat <youssefesmat@...omium.org>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
 Juri Lelli <juri.lelli@...hat.com>,
 Vincent Guittot <vincent.guittot@...aro.org>,
 Dietmar Eggemann <dietmar.eggemann@....com>,
 Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
 Mel Gorman <mgorman@...e.de>, Daniel Bristot de Oliveira
 <bristot@...hat.com>, Valentin Schneider <vschneid@...hat.com>,
 linux-kernel@...r.kernel.org, Tobias Huschle <huschle@...ux.ibm.com>,
 Luis Machado <luis.machado@....com>, Chen Yu <yu.c.chen@...el.com>,
 Abel Wu <wuyun.abel@...edance.com>, Tianchen Ding
 <dtcccc@...ux.alibaba.com>, Xuewen Yan <xuewen.yan94@...il.com>,
 "Gautham R. Shenoy" <gautham.shenoy@....com>
Subject: Re: [RFC PATCH 1/1] sched/eevdf: Skip eligibility check for current
 entity during wakeup preemption

Hello Youssef,

On 3/26/2024 8:36 AM, K Prateek Nayak wrote:
>> [..snip..]
>>
>> Thanks for sharing this Prateek.
>> We actually noticed we could also gain performance by disabling
>> eligibility checks (but disable it on all paths).
>> The following are a few threads we had on the topic:
>>
>> Discussion around eligibility:
>> https://lore.kernel.org/lkml/CA+q576MS0-MV1Oy-eecvmYpvNT3tqxD8syzrpxQ-Zk310hvRbw@mail.gmail.com/
>> Some of our results:
>> https://lore.kernel.org/lkml/CA+q576Mov1jpdfZhPBoy_hiVh3xSWuJjXdP3nS4zfpqfOXtq7Q@mail.gmail.com/
>> Sched feature to disable eligibility:
>> https://lore.kernel.org/lkml/20231013030213.2472697-1-youssefesmat@chromium.org/
>>
> 
> Thank you for pointing me to the discussions. I'll give this a spin on
> my machine and report back what I see. Hope some of it will help during
> the OSPM discussion :)

Sorry about the delay but on a positive note, I do not see any
concerning regressions after dropping the eligibility criteria. I'll
leave the full results from my testing below.
	
o System Details

- 3rd Generation EPYC System
- 2 x 64C/128T
- NPS1 mode

o Kernels

tip:			tip:sched/core at commit 4475cd8bfd9b
			("sched/balancing: Simplify the sg_status
			bitmask and use separate ->overloaded and
			->overutilized flags")

eie:			(everyone is eligible)
			tip with vruntime_eligible() and entity_eligible()
			changed to always return true.

o Results

==================================================================
Test          : hackbench
Units         : Normalized time in seconds
Interpretation: Lower is better
Statistic     : AMean
==================================================================
Case:           tip[pct imp](CV)         eie[pct imp](CV)
 1-groups     1.00 [ -0.00]( 1.94)     0.95 [  5.11]( 2.56)
 2-groups     1.00 [ -0.00]( 2.41)     0.97 [  2.80]( 1.52)
 4-groups     1.00 [ -0.00]( 1.16)     0.95 [  5.01]( 1.04)
 8-groups     1.00 [ -0.00]( 1.72)     0.96 [  4.37]( 1.01)
16-groups     1.00 [ -0.00]( 2.16)     0.94 [  5.88]( 2.30)
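
For anyone reading the tables for the first time, here is a minimal
sketch of how columns like "pct imp" and "CV" can be derived from raw
runs. The formulas are the conventional ones and are an assumption, not
something spelled out in this thread; the helper names and the sample
runtimes are made up for illustration and are not data from these tests.

```python
from statistics import mean, stdev


def pct_improvement(baseline, value, lower_is_better):
    """Percent improvement of `value` over `baseline` (positive means better)."""
    delta = (baseline - value) if lower_is_better else (value - baseline)
    return 100.0 * delta / baseline


def coefficient_of_variation(samples):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * stdev(samples) / mean(samples)


# Made-up hackbench runtimes in seconds (hypothetical, not from this thread).
tip_runs = [4.10, 4.00, 3.90]
eie_runs = [3.85, 3.80, 3.75]

tip_mean = mean(tip_runs)
eie_mean = mean(eie_runs)

# Normalize against the baseline so that tip always reads 1.00.
normalized_eie = eie_mean / tip_mean

# hackbench reports time, so a lower value counts as an improvement.
improvement = pct_improvement(tip_mean, eie_mean, lower_is_better=True)

print(f"eie {normalized_eie:.2f} [{improvement:6.2f}]"
      f"({coefficient_of_variation(eie_runs):5.2f})")
```

For the throughput and bandwidth tests, the same helper would be called
with lower_is_better=False so that a higher value shows up as a positive
"pct imp".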


==================================================================
Test          : tbench
Units         : Normalized throughput
Interpretation: Higher is better
Statistic     : AMean
==================================================================
Clients:    tip[pct imp](CV)         eie[pct imp](CV)
    1     1.00 [  0.00]( 0.69)     1.00 [  0.05]( 0.61)
    2     1.00 [  0.00]( 0.25)     1.00 [  0.06]( 0.51)
    4     1.00 [  0.00]( 1.04)     0.98 [ -1.69]( 1.21)
    8     1.00 [  0.00]( 0.72)     1.00 [ -0.13]( 0.56)
   16     1.00 [  0.00]( 2.40)     1.00 [  0.43]( 0.63)
   32     1.00 [  0.00]( 0.62)     0.98 [ -1.80]( 2.18)
   64     1.00 [  0.00]( 1.19)     0.98 [ -2.13]( 1.26)
  128     1.00 [  0.00]( 0.91)     1.00 [  0.37]( 0.50)
  256     1.00 [  0.00]( 0.52)     1.00 [ -0.11]( 0.21)
  512     1.00 [  0.00]( 0.36)     1.02 [  1.54]( 0.58)
 1024     1.00 [  0.00]( 0.26)     1.01 [  1.21]( 0.41)


==================================================================
Test          : stream-10
Units         : Normalized Bandwidth, MB/s
Interpretation: Higher is better
Statistic     : HMean
==================================================================
Test:       tip[pct imp](CV)         eie[pct imp](CV)
 Copy     1.00 [  0.00]( 5.01)     1.01 [  1.27]( 4.63)
Scale     1.00 [  0.00]( 6.93)     1.03 [  2.66]( 5.20)
  Add     1.00 [  0.00]( 5.94)     1.03 [  3.41]( 4.99)
Triad     1.00 [  0.00]( 6.40)     0.95 [ -4.69]( 8.29)


==================================================================
Test          : stream-100
Units         : Normalized Bandwidth, MB/s
Interpretation: Higher is better
Statistic     : HMean
==================================================================
Test:       tip[pct imp](CV)         eie[pct imp](CV)
 Copy     1.00 [  0.00]( 2.84)     1.00 [ -0.37]( 2.44)
Scale     1.00 [  0.00]( 5.26)     1.00 [  0.21]( 3.88)
  Add     1.00 [  0.00]( 4.98)     1.00 [  0.11]( 1.15)
Triad     1.00 [  0.00]( 1.60)     0.96 [ -3.72]( 5.26)


==================================================================
Test          : netperf
Units         : Normalized Throughput
Interpretation: Higher is better
Statistic     : AMean
==================================================================
Clients:         tip[pct imp](CV)         eie[pct imp](CV)
 1-clients     1.00 [  0.00]( 0.90)     1.00 [ -0.09]( 0.16)
 2-clients     1.00 [  0.00]( 0.77)     0.99 [ -0.89]( 0.97)
 4-clients     1.00 [  0.00]( 0.63)     0.99 [ -1.03]( 1.53)
 8-clients     1.00 [  0.00]( 0.52)     0.99 [ -0.86]( 1.66)
16-clients     1.00 [  0.00]( 0.43)     0.99 [ -0.91]( 0.79)
32-clients     1.00 [  0.00]( 0.88)     0.98 [ -2.37]( 1.42)
64-clients     1.00 [  0.00]( 1.63)     0.96 [ -4.07]( 0.91)	*
128-clients    1.00 [  0.00]( 0.94)     1.00 [ -0.30]( 0.94)
256-clients    1.00 [  0.00]( 5.08)     0.95 [ -4.95]( 3.36)
512-clients    1.00 [  0.00](51.89)     0.99 [ -0.93](51.00)

* This seems to be the only point of regression with low CV. I'll
  rerun this and report back if I see a consistent dip but for the
  time being I'm not worried.


==================================================================
Test          : schbench
Units         : Normalized 99th percentile latency in us
Interpretation: Lower is better
Statistic     : Median
==================================================================
#workers: tip[pct imp](CV)         eie[pct imp](CV)
  1     1.00 [ -0.00](30.01)     0.97 [  3.12](14.32)
  2     1.00 [ -0.00](26.14)     1.23 [-22.58](13.48)
  4     1.00 [ -0.00](13.22)     1.00 [ -0.00]( 6.04)
  8     1.00 [ -0.00]( 6.23)     1.00 [ -0.00](13.09)
 16     1.00 [ -0.00]( 3.49)     1.02 [ -1.69]( 3.43)
 32     1.00 [ -0.00]( 2.20)     0.98 [  2.13]( 2.47)
 64     1.00 [ -0.00]( 7.17)     0.88 [ 12.50]( 3.18)
128     1.00 [ -0.00]( 2.79)     1.02 [ -2.46]( 8.29)
256     1.00 [ -0.00](13.02)     1.01 [ -1.34](37.58)
512     1.00 [ -0.00]( 4.27)     0.79 [ 21.49]( 2.41)


==================================================================
Test          : DeathStarBench
Units         : Normalized throughput
Interpretation: Higher is better
Statistic     : Mean
==================================================================
Pinning      scaling     tip                eie (pct imp)
 1CCD           1       1.00            1.15 (%diff: 15.68%)
 2CCD           2       1.00            0.99 (%diff: -1.12%)
 4CCD           4       1.00            1.11 (%diff: 11.65%)
 8CCD           8       1.00            1.05 (%diff:  4.98%)

--

> 
> [..snip..]
>  

I'll try to get data from more workloads and will update the thread
when it arrives.

--
Thanks and Regards,
Prateek