Date:   Mon, 26 Jun 2023 11:34:41 +0530
From:   "Gautham R. Shenoy" <gautham.shenoy@....com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     David Vernet <void@...ifault.com>, linux-kernel@...r.kernel.org,
        mingo@...hat.com, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, rostedt@...dmis.org,
        dietmar.eggemann@....com, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, vschneid@...hat.com, joshdon@...gle.com,
        roman.gushchin@...ux.dev, tj@...nel.org, kernel-team@...a.com,
        K Prateek Nayak <kprateek.nayak@....com>
Subject: Re: [RFC PATCH 3/3] sched: Implement shared wakequeue in CFS

Hello Peter, David,

On Fri, Jun 23, 2023 at 03:20:15PM +0530, Gautham R. Shenoy wrote:
> On Thu, Jun 22, 2023 at 12:29:35PM +0200, Peter Zijlstra wrote:
> > On Thu, Jun 22, 2023 at 02:41:57PM +0530, Gautham R. Shenoy wrote:

> 
> I will post more results later.

I was able to get some numbers for hackbench, schbench (old), and
tbench over the weekend on a 2-socket Zen3 box with 64 cores (128
threads) per socket, configured in NPS1 mode.

The legend is as follows:

tip : tip/sched/core with HEAD being commit e2a1f85bf9f5 ("sched/psi:
      Avoid resetting the min update period when it is unnecessary")


david : This patchset

david-ego-1 : David's patchset + my modification to allow SIS
              (select_idle_sibling) to signal that a task should be
              queued on the shared-wakequeue when SIS cannot find an
              idle CPU to wake up the task.

david-ego-2 : david-ego-1 (which includes David's patchset) + my
              modification to remove from the shared-wakequeue the
              first task whose cpus_allowed contains this CPU.
              Currently we don't do this check and always remove the
              first task.


david-ego-1 and david-ego-2 are attached with this mail.

hackbench (Measure: time taken to complete, in seconds)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Test:         tip                david                david-ego-1          david-ego-2
1-groups:     3.92 (0.00 pct)    3.35 (14.54 pct)     3.53 (9.94 pct)      3.30 (15.81 pct)
2-groups:     4.58 (0.00 pct)    3.89 (15.06 pct)     3.95 (13.75 pct)     3.79 (17.24 pct)
4-groups:     4.99 (0.00 pct)    4.42 (11.42 pct)     4.76 (4.60 pct)      4.77 (4.40 pct)
8-groups:     5.67 (0.00 pct)    5.08 (10.40 pct)     6.16 (-8.64 pct)     6.33 (-11.64 pct)
16-groups:    7.88 (0.00 pct)    7.32 (7.10 pct)      8.57 (-8.75 pct)     9.77 (-23.98 pct)
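
For reference, the "pct" column in these tables appears to be the change relative to tip, signed so that a positive value means the variant did better. A minimal Python sketch of that arithmetic follows. It assumes this sign convention, and it assumes the figures are truncated rather than rounded to two decimals (suggested by entries such as 9.94 pct for 3.53 vs 3.92). The function name is made up for illustration.

```python
import math


def pct_vs_baseline(baseline, value, lower_is_better=True):
    """Signed percent change vs. baseline; positive means 'value' is better."""
    if lower_is_better:
        delta = baseline - value      # e.g. run time or latency
    else:
        delta = value - baseline      # e.g. throughput
    pct = delta / baseline * 100.0
    # Truncate toward zero to two decimals (an assumption inferred from the table).
    return math.trunc(pct * 100) / 100


# hackbench, 1-groups: tip = 3.92 s, david = 3.35 s
print(pct_vs_baseline(3.92, 3.35))
# hackbench, 8-groups: tip = 5.67 s, david-ego-2 = 6.33 s (a regression)
print(pct_vs_baseline(5.67, 6.33))
```

With `lower_is_better=False` the same helper would cover throughput-style results, where a larger number is the better one.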


Observation: David's patchset does very well across all the groups.
Expanding the scope of the shared-wakequeue with david-ego-1 doesn't
give us much, and in fact hurts at higher utilization (8 and 16
groups). The same holds for david-ego-2, which only pulls allowed
tasks from the shared-wakequeue. In david-ego-2 we see a greater
amount of spinlock contention for 8 and 16 groups, as the code holds
the spinlock while iterating through the list members and checking
cpu-affinity.

So, David's original patchset wins this one.
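
To illustrate why the affinity check can lengthen the time spent under the lock, here is a toy Python model of the two pull policies described above. It is only a sketch, not the kernel code: the class, its methods, and the `visited_under_lock` counter are all hypothetical names, and a `threading.Lock` stands in for the spinlock.

```python
import threading
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    cpus_allowed: frozenset  # CPUs this task may run on


class SharedWakequeue:
    """Toy stand-in for a shared wakequeue (hypothetical, not the kernel code)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.tasks = deque()
        self.visited_under_lock = 0  # crude proxy for lock hold time

    def push(self, task):
        with self.lock:
            self.tasks.append(task)

    def pull_first(self):
        """Always take the head, with no affinity check."""
        with self.lock:
            if not self.tasks:
                return None
            self.visited_under_lock += 1
            return self.tasks.popleft()

    def pull_first_allowed(self, cpu):
        """Take the first task whose cpus_allowed contains cpu."""
        with self.lock:
            for i, task in enumerate(self.tasks):
                self.visited_under_lock += 1
                if cpu in task.cpus_allowed:
                    del self.tasks[i]
                    return task
            return None


q = SharedWakequeue()
for i in range(4):
    q.push(Task(f"t{i}", frozenset({0})))   # pinned to CPU 0
q.push(Task("t4", frozenset({1})))          # only this one may run on CPU 1
picked = q.pull_first_allowed(cpu=1)
print(picked.name, q.visited_under_lock)    # prints "t4 5"
```

In this model `pull_first` touches one entry per call, while `pull_first_allowed` may walk the whole list with the lock held, which matches the direction of the contention seen at 8 and 16 groups.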




schbench (Measure : 99th Percentile latency, in us)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#workers: tip                     david                   david-ego-1             david-ego-2
  1:      26.00 (0.00 pct)        21.00 (19.23 pct)       28.00 (-7.69 pct)       22.00 (15.38 pct)
  2:      27.00 (0.00 pct)        29.00 (-7.40 pct)       28.00 (-3.70 pct)       30.00 (-11.11 pct)
  4:      31.00 (0.00 pct)        31.00 (0.00 pct)        31.00 (0.00 pct)        28.00 (9.67 pct)
  8:      36.00 (0.00 pct)        37.00 (-2.77 pct)       34.00 (5.55 pct)        39.00 (-8.33 pct)
 16:      49.00 (0.00 pct)        49.00 (0.00 pct)        48.00 (2.04 pct)        50.00 (-2.04 pct)
 32:      80.00 (0.00 pct)        80.00 (0.00 pct)        88.00 (-10.00 pct)      79.00 (1.25 pct)
 64:      169.00 (0.00 pct)       180.00 (-6.50 pct)      174.00 (-2.95 pct)      168.00 (0.59 pct)
128:      343.00 (0.00 pct)       355.00 (-3.49 pct)      356.00 (-3.79 pct)      344.00 (-0.29 pct)
256:      42048.00 (0.00 pct)     46528.00 (-10.65 pct)   51904.00 (-23.43 pct)   48064.00 (-14.30 pct)
512:      95104.00 (0.00 pct)     95872.00 (-0.80 pct)    95360.00 (-0.26 pct)    97152.00 (-2.15 pct)


Observations: There are run-to-run variations with this benchmark. I
will try with the newer schbench later this week. 

tbench (Measure: Throughput, records/s)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Clients: tip			 sis-node		 david			 david-ego-1		 david-ego-2
    1	 452.49 (0.00 pct)	 457.94 (1.20 pct)	 448.52 (-0.87 pct)	 447.11 (-1.18 pct)	 458.45 (1.31 pct)
    2	 862.44 (0.00 pct)	 879.99 (2.03 pct)	 860.14 (-0.26 pct)	 873.27 (1.25 pct)	 891.72 (3.39 pct)
    4	 1604.27 (0.00 pct)	 1618.87 (0.91 pct)	 1610.95 (0.41 pct)	 1628.45 (1.50 pct)	 1657.26 (3.30 pct)
    8	 2966.77 (0.00 pct)	 3040.90 (2.49 pct)	 2991.07 (0.81 pct)	 3063.31 (3.25 pct)	 3106.50 (4.70 pct)
   16	 5176.70 (0.00 pct)	 5292.29 (2.23 pct)	 5478.32 (5.82 pct)	 5462.05 (5.51 pct)	 5537.15 (6.96 pct)
   32	 8205.24 (0.00 pct)	 8949.12 (9.06 pct)	 9039.63 (10.16 pct)	 9466.07 (15.36 pct)	 9365.06 (14.13 pct)
   64	 13956.71 (0.00 pct)	 14461.42 (3.61 pct)	 16337.65 (17.05 pct)	 16941.63 (21.38 pct)	 15697.47 (12.47 pct)
  128	 24005.50 (0.00 pct)	 26052.75 (8.52 pct)	 25605.24 (6.66 pct)	 27243.19 (13.48 pct)	 24854.60 (3.53 pct)
  256	 32457.61 (0.00 pct)	 21999.41 (-32.22 pct)	 36953.22 (13.85 pct)	 32299.31 (-0.48 pct)	 33037.03 (1.78 pct)
  512	 34345.24 (0.00 pct)	 41166.39 (19.86 pct)	 40845.23 (18.92 pct)	 40797.97 (18.78 pct)	 38150.17 (11.07 pct)
 1024	 33432.92 (0.00 pct)	 40900.84 (22.33 pct)	 39749.35 (18.89 pct)	 41133.82 (23.03 pct)	 38464.26 (15.04 pct)


Observations: tbench really likes all variants of shared-wakequeue. I
have also included sis-node numbers since we saw that tbench liked
sis-node.

Also note that david-ego-1, which extends the usage of the
shared-wakequeue to the waker's target when the waker's LLC is busy,
shows a benefit in every case except 1 client (-1.18 pct) and 256
clients (-0.48 pct; number of clients == number of threads in the
system).
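
As a rough picture of the david-ego-1 idea, the following toy Python function models an idle-CPU search that either finds an idle CPU or falls back to the target and signals that the task should go on the shared wakequeue. This is only a sketch of the behaviour as described in this mail, not the patch itself; the function name, its arguments, and the tuple return value are all assumptions made for illustration.

```python
def select_idle_cpu(llc_cpus, idle_cpus, target_cpu):
    """Toy idle-CPU search: return (cpu, queue_on_shared_wakequeue)."""
    for cpu in llc_cpus:
        if cpu in idle_cpus:
            # An idle CPU exists in the LLC: wake the task there directly.
            return cpu, False
    # No idle CPU found: fall back to the target CPU and signal that the
    # task should also be placed on the shared wakequeue.
    return target_cpu, True


print(select_idle_cpu([0, 1, 2, 3], {2}, target_cpu=0))
print(select_idle_cpu([0, 1, 2, 3], set(), target_cpu=0))
```

The second call models the busy-LLC case, where the returned flag is what would route the task onto the shared wakequeue.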

I will try to get the netperf, postgresql, SPECjbb and DeathStarBench
numbers this week.

--
Thanks and Regards
gautham.







View attachment "david-ego-1.patch" of type "text/x-diff" (2724 bytes)

View attachment "david-ego-2.patch" of type "text/x-diff" (2427 bytes)
