Message-ID: <c26251d2-e1bf-e5c7-0636-12ad886e1ea8@amd.com>
Date: Wed, 28 Feb 2024 23:07:37 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: John Stultz <jstultz@...gle.com>
Cc: LKML <linux-kernel@...r.kernel.org>, Joel Fernandes <joelaf@...gle.com>,
 Qais Yousef <qyousef@...gle.com>, Ingo Molnar <mingo@...hat.com>,
 Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>,
 Vincent Guittot <vincent.guittot@...aro.org>,
 Dietmar Eggemann <dietmar.eggemann@....com>,
 Valentin Schneider <vschneid@...hat.com>,
 Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
 Zimuzo Ezeozue <zezeozue@...gle.com>, Youssef Esmat
 <youssefesmat@...gle.com>, Mel Gorman <mgorman@...e.de>,
 Daniel Bristot de Oliveira <bristot@...hat.com>,
 Will Deacon <will@...nel.org>, Waiman Long <longman@...hat.com>,
 Boqun Feng <boqun.feng@...il.com>, "Paul E. McKenney" <paulmck@...nel.org>,
 Metin Kaya <Metin.Kaya@....com>, Xuewen Yan <xuewen.yan94@...il.com>,
 Thomas Gleixner <tglx@...utronix.de>, kernel-team@...roid.com
Subject: Re: [RESEND][PATCH v8 0/7] Preparatory changes for Proxy Execution v8

Hello John,

On 2/28/2024 10:54 AM, John Stultz wrote:
> On Tue, Feb 27, 2024 at 9:12 PM K Prateek Nayak <kprateek.nayak@....com> wrote:
>> On 2/28/2024 10:21 AM, John Stultz wrote:
>>> Just to clarify: by "this series" did you test just the 7 preparatory
>>> patches submitted to the list here, or did you pull the full
>>> proxy-exec-v8-6.8-rc3 set from git?
>>
>> Just these preparatory patches for now. On my way to queue a run for the
>> whole set from your tree. I'll use the "proxy-exec-v8-6.8-rc3" branch and
>> pick the commits past the
>> "[ANNOTATION] === Proxy Exec patches past this point ===" till the commit
>> ff90fb583a81 ("FIX: Avoid using possibly uninitialized cpu value with
>> activate_blocked_entities()") on top of the tip:sched/core mentioned
>> above since it'll allow me to reuse the baseline numbers :)
>>
> 
> Ah, thank you for the clarification!
> 
> Also, I really appreciate your testing with the rest of the series as
> well. It will be good to have any potential problems identified early

I got a chance to test the whole v8 patch set on the same dual-socket
3rd Generation EPYC system:

tl;dr

- There is a slight regression in hackbench, but instead of the 10x
  blowup seen previously, it is only around 5%, and the overloaded
  case does not regress at all.

- A small but consistent (~2-3%) regression is seen in tbench and
  netperf.

- schbench is inconclusive due to run-to-run variance, and stream is
  perf neutral with proxy execution.

I haven't looked deeper into the regressions yet. I'll let you know if
I spot anything when digging deeper. Below are the full results:

o System Details

- 3rd Generation EPYC System
- 2 x 64C/128T
- NPS1 mode

o Kernels

tip:			tip:sched/core at commit 8cec3dd9e593
			("sched/core: Simplify code by removing
			 duplicate #ifdefs")

proxy-exec-full:	tip + proxy execution commits from
			"proxy-exec-v8-6.8-rc3" described previously in
			this thread.

o Results

==================================================================
Test          : hackbench
Units         : Normalized time in seconds
Interpretation: Lower is better
Statistic     : AMean
==================================================================
Case:           tip[pct imp](CV)    proxy-exec-full[pct imp](CV)
 1-groups     1.00 [ -0.00]( 2.08)     1.00 [ -0.18]( 3.90)
 2-groups     1.00 [ -0.00]( 0.89)     1.04 [ -4.43]( 0.78)
 4-groups     1.00 [ -0.00]( 0.81)     1.05 [ -4.82]( 1.03)
 8-groups     1.00 [ -0.00]( 0.78)     1.02 [ -1.90]( 1.00)
16-groups     1.00 [ -0.00]( 1.60)     1.01 [ -0.80]( 1.18)
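
For anyone unfamiliar with the column layout used in these tables: the
bracketed "pct imp" value reads as the percentage improvement over tip
(negative means a regression), with the sign flipped for lower-is-better
metrics, and CV is the coefficient of variation in percent. Below is a
minimal sketch under those assumptions. The function names are
hypothetical and are not taken from the tooling that produced these
numbers. The normalized columns are rounded to two decimals, so
recomputing from them only approximates the bracketed values (for
example 1.05 gives -5.0 rather than the -4.82 shown).

```python
import statistics


def pct_improvement(baseline, value, lower_is_better):
    """Percentage improvement of value over baseline.

    Negative results indicate a regression. The sign convention is an
    assumption inferred from the tables, not a documented definition.
    """
    if lower_is_better:
        return (baseline - value) / baseline * 100.0
    return (value - baseline) / baseline * 100.0


def coefficient_of_variation(samples):
    """Sample standard deviation expressed as a percentage of the mean."""
    return statistics.stdev(samples) / statistics.mean(samples) * 100.0


# hackbench (time, lower is better): normalized 1.05 against a 1.00 baseline
hackbench_delta = pct_improvement(1.00, 1.05, lower_is_better=True)

# tbench (throughput, higher is better): normalized 0.97 against 1.00
tbench_delta = pct_improvement(1.00, 0.97, lower_is_better=False)

print(round(hackbench_delta, 2), round(tbench_delta, 2))
```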


==================================================================
Test          : tbench
Units         : Normalized throughput
Interpretation: Higher is better
Statistic     : AMean
==================================================================
Clients:    tip[pct imp](CV)    proxy-exec-full[pct imp](CV)
    1     1.00 [  0.00]( 0.71)     0.97 [ -3.00]( 0.15)
    2     1.00 [  0.00]( 0.25)     0.97 [ -3.35]( 0.98)
    4     1.00 [  0.00]( 0.85)     0.97 [ -3.26]( 1.40)
    8     1.00 [  0.00]( 1.00)     0.97 [ -2.75]( 0.46)
   16     1.00 [  0.00]( 1.25)     0.99 [ -1.27]( 0.11)
   32     1.00 [  0.00]( 0.35)     0.98 [ -2.42]( 0.06)
   64     1.00 [  0.00]( 0.71)     0.97 [ -2.76]( 1.81)
  128     1.00 [  0.00]( 0.46)     0.97 [ -2.67]( 0.88)
  256     1.00 [  0.00]( 0.24)     0.98 [ -1.97]( 0.98)
  512     1.00 [  0.00]( 0.30)     0.98 [ -2.41]( 0.38)
 1024     1.00 [  0.00]( 0.40)     0.98 [ -2.21]( 0.11)


==================================================================
Test          : stream-10
Units         : Normalized Bandwidth, MB/s
Interpretation: Higher is better
Statistic     : HMean
==================================================================
Test:       tip[pct imp](CV)    proxy-exec-full[pct imp](CV)
 Copy     1.00 [  0.00]( 9.73)     1.00 [  0.26]( 6.36)
Scale     1.00 [  0.00]( 5.57)     1.02 [  1.59]( 2.98)
  Add     1.00 [  0.00]( 5.43)     1.00 [  0.48]( 2.77)
Triad     1.00 [  0.00]( 5.50)     0.98 [ -2.18]( 6.06)


==================================================================
Test          : stream-100
Units         : Normalized Bandwidth, MB/s
Interpretation: Higher is better
Statistic     : HMean
==================================================================
Test:       tip[pct imp](CV)    proxy-exec-full[pct imp](CV)
 Copy     1.00 [  0.00]( 3.26)     0.98 [ -1.96]( 3.24)
Scale     1.00 [  0.00]( 1.26)     0.96 [ -3.61]( 6.41)
  Add     1.00 [  0.00]( 1.47)     0.98 [ -1.84]( 4.14)
Triad     1.00 [  0.00]( 1.77)     1.00 [  0.27]( 2.60)


==================================================================
Test          : netperf
Units         : Normalized Throughput
Interpretation: Higher is better
Statistic     : AMean
==================================================================
Clients:         tip[pct imp](CV)    proxy-exec-full[pct imp](CV)
 1-clients     1.00 [  0.00]( 0.22)     0.97 [ -3.01]( 0.40)
 2-clients     1.00 [  0.00]( 0.57)     0.97 [ -3.25]( 0.45)
 4-clients     1.00 [  0.00]( 0.43)     0.97 [ -3.26]( 0.59)
 8-clients     1.00 [  0.00]( 0.27)     0.97 [ -2.83]( 0.55)
16-clients     1.00 [  0.00]( 0.46)     0.97 [ -2.99]( 0.65)
32-clients     1.00 [  0.00]( 0.95)     0.97 [ -2.98]( 0.71)
64-clients     1.00 [  0.00]( 1.79)     0.97 [ -2.61]( 1.38)
128-clients    1.00 [  0.00]( 0.89)     0.97 [ -2.72]( 0.94)
256-clients    1.00 [  0.00]( 3.88)     0.98 [ -1.89]( 2.92)
512-clients    1.00 [  0.00](35.06)     0.99 [ -0.78](47.83)


==================================================================
Test          : schbench
Units         : Normalized 99th percentile latency in us
Interpretation: Lower is better
Statistic     : Median
==================================================================
#workers: tip[pct imp](CV)    proxy-exec-full[pct imp](CV)
  1     1.00 [ -0.00](27.28)     1.31 [-31.25]( 6.45)
  2     1.00 [ -0.00]( 3.85)     0.95 [  5.00](10.02)
  4     1.00 [ -0.00](14.00)     1.11 [-10.53]( 1.36)
  8     1.00 [ -0.00]( 4.68)     1.15 [-14.58](14.55)
 16     1.00 [ -0.00]( 4.08)     0.98 [  1.61]( 3.28)
 32     1.00 [ -0.00]( 6.68)     1.02 [ -2.04]( 1.71)
 64     1.00 [ -0.00]( 1.79)     1.12 [-11.73]( 7.08)
128     1.00 [ -0.00]( 6.30)     1.11 [-10.84]( 5.52)
256     1.00 [ -0.00](43.39)     1.37 [-37.14](20.11)
512     1.00 [ -0.00]( 2.26)     0.99 [  1.17]( 1.43)
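
One rough way to read the schbench numbers against their run-to-run
variance is to compare each delta with the larger of the two CVs in its
row. The sketch below applies that crude heuristic to the rows above.
The threshold rule and the names `within_noise` and `outside` are
assumptions for illustration only; this is not a statistical test and
not how the results in this mail were judged.

```python
def within_noise(pct_imp, cv_base, cv_new):
    """Crude heuristic: treat a delta as noise when its magnitude does
    not exceed the larger coefficient of variation of the two kernels."""
    return abs(pct_imp) <= max(cv_base, cv_new)


# (workers, pct imp, CV tip, CV proxy-exec-full), copied from the table
schbench_rows = [
    (1, -31.25, 27.28, 6.45),
    (2, 5.00, 3.85, 10.02),
    (4, -10.53, 14.00, 1.36),
    (8, -14.58, 4.68, 14.55),
    (16, 1.61, 4.08, 3.28),
    (32, -2.04, 6.68, 1.71),
    (64, -11.73, 1.79, 7.08),
    (128, -10.84, 6.30, 5.52),
    (256, -37.14, 43.39, 20.11),
    (512, 1.17, 2.26, 1.43),
]

# Worker counts whose delta exceeds both CVs under this heuristic
outside = [w for (w, imp, cv_a, cv_b) in schbench_rows
           if not within_noise(imp, cv_a, cv_b)]
print("worker counts with deltas larger than either CV:", outside)
```

Rows flagged this way are only candidates for a closer look; with CVs
as high as those in this table, more runs would be needed before
calling any of them a regression.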


==================================================================
Test          : Unixbench
Units         : Normalized scores
Interpretation: Lower is better
Statistic     : Various (Mentioned)
==================================================================
Metric	  Variant                    tip        proxy-exec-full
Hmean     unixbench-dhry2reg-1    0.00%           -0.67%
Hmean     unixbench-dhry2reg-512  0.00%            0.14%
Amean     unixbench-syscall-1     0.00%           -0.86%
Amean     unixbench-syscall-512   0.00%           -6.42%
Hmean     unixbench-pipe-1        0.00%            0.79%
Hmean     unixbench-pipe-512      0.00%            0.57%
Hmean     unixbench-spawn-1       0.00%           -3.91%
Hmean     unixbench-spawn-512     0.00%            3.17%
Hmean     unixbench-execl-1       0.00%           -1.18%
Hmean     unixbench-execl-512     0.00%            1.26%
--

> (I'm trying to get v9 ready as soon as I can here, as its fixed a
> number of smaller issues - However, I've also managed to uncover some
> new problems in stress testing, so we'll see how quickly I can chase
> those down).

I haven't seen any splats when running the above tests. I'll test some
larger workloads next. Please let me know if you would like me to test
any specific workload or need additional data from these tests :)

> 
> thanks
> -john
 
--
Thanks and Regards,
Prateek
