linux-kernel - Re: schbench v1.0

Message-ID: <ZNoeXISDHERwXx5l@chenyu5-mobl2>
Date:   Mon, 14 Aug 2023 20:30:20 +0800
From:   Chen Yu <yu.c.chen@...el.com>
To:     Chris Mason <clm@...a.com>
CC:     Peter Zijlstra <peterz@...radead.org>,
        David Vernet <void@...ifault.com>,
        <linux-kernel@...r.kernel.org>, <kernel-team@...com>,
        Ingo Molnar <mingo@...nel.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        <gautham.shenoy@....com>
Subject: Re: schbench v1.0

Hi Chris,

On 2023-04-21 at 14:14:10 -0400, Chris Mason wrote:
> On 4/20/23 11:05 AM, Peter Zijlstra wrote:
> > On Mon, Apr 17, 2023 at 10:10:25AM +0200, Chris Mason wrote:
> > 
> >> F128 N10                EEVDF    Linus
> >> Wakeup  (usec): 99.0th: 755      1,266
> >> Request (usec): 99.0th: 25,632   22,304
> >> RPS    (count): 50.0th: 4,280    4,376
> >>
> >> F128 N10 no-locking     EEVDF    Linus
> >> Wakeup  (usec): 99.0th: 823      1,118
> >> Request (usec): 99.0th: 17,184   14,192
> >> RPS    (count): 50.0th: 4,440    4,456
> > 
> > With the below fixlet (against queue/sched/eevdf) on my measly IVB-EP
> > (2*10*2):
> > 
> > ./schbench -F128 -n10 -C
> > 
> > Request Latencies percentiles (usec) runtime 30 (s) (153800 total samples)
> > 	  90.0th: 6376       (35699 samples)
> > 	* 99.0th: 6440       (9055 samples)
> > 	  99.9th: 7048       (1345 samples)
> > 
> > CFS
> > 
> > schbench -m2 -F128 -n10	-r90	OTHER	BATCH
> > Wakeup  (usec): 99.0th:		6600	6328
> > Request (usec): 99.0th:		35904	14640
> > RPS    (count): 50.0th:		5368	6104
> > 
> 
> Peter and I went back and forth a bit and now schbench git has a few fixes:
> 
> - README.md updated
> 
> - warmup time defaults to zero (disabling warmup).  This was causing the
> stats inconsistency Peter noticed below.
> 
> - RPS calculated more often.  Every second instead of every reporting
> interval.
> 
> - thread count scaled to CPU count when -m is used.  The thread count is
> per message thread, so when you use -m2 like Peter did in these runs,
> he was ending up with 2xNUM_CPUs workers.  That's why his wakeup
> latencies are so high, he had double the work that I did.
> 
> I'll experiment with some of the suggestions he made too.
> 

Sorry for jumping in. We ran into an issue with the latest schbench while
doing some EEVDF tests, and then found this thread. There seems to be a
minor corner case that still needs handling. Could you take a look and see
whether the following change makes sense?

thanks,
Chenyu

From e84f7634ab611a560a866c887438a4ebd79935ed Mon Sep 17 00:00:00 2001
From: Chen Yu <yu.c.chen@...el.com>
Date: Mon, 14 Aug 2023 05:00:06 -0700
Subject: [PATCH] schbench: fix per-cpu spin lock

On a system with one socket offline, the CPU ids might not
be contiguous. The per_cpu_locks array is sized by the number
of online CPUs, but it is not accessed with contiguous indices:

CPU(s):                          224
On-line CPU(s) list:             0-55,112-167
Off-line CPU(s) list:            56-111,168-223

The per_cpu_locks array is allocated with 112 elements, but it
can be accessed at an index of 112 or higher. This could cause
an unexpected deadlock during the test.

Fix this by sizing per_cpu_locks by the number of possible
CPUs, at the cost of some wasted space.

Signed-off-by: Chen Yu <yu.c.chen@...el.com>
---
 schbench.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/schbench.c b/schbench.c
index 937f1f2..3eaf1a4 100644
--- a/schbench.c
+++ b/schbench.c
@@ -1359,7 +1359,7 @@ int main(int ac, char **av)
 
 	matrix_size = sqrt(cache_footprint_kb * 1024 / 3 / sizeof(unsigned long));
 
-	num_cpu_locks = get_nprocs();
+	num_cpu_locks = get_nprocs_conf();
 	per_cpu_locks = calloc(num_cpu_locks, sizeof(struct per_cpu_lock));
 	if (!per_cpu_locks) {
 		perror("unable to allocate memory for per cpu locks\n");
-- 
2.25.1

> -chris
>
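
The sizing problem described in the patch can be sketched in isolation. This is a minimal illustration, not schbench code: `struct cpu_slot`, `slot_index_ok()` and `alloc_cpu_slots()` are hypothetical names, and the sketch assumes the table is indexed directly by CPU id and that every CPU id is below the value returned by `get_nprocs_conf()`.

```c
#include <assert.h>
#include <stdlib.h>       /* calloc(), free() */
#include <sys/sysinfo.h>  /* get_nprocs_conf() (glibc) */

/* Hypothetical stand-in for schbench's struct per_cpu_lock. */
struct cpu_slot {
	int lock;
};

/* Return 1 if cpu is a valid index into a table of nr_slots entries. */
static int slot_index_ok(int cpu, int nr_slots)
{
	return cpu >= 0 && cpu < nr_slots;
}

/*
 * Allocate one slot per configured CPU rather than per online CPU,
 * so that a table indexed by CPU id still has room when the online
 * ids are sparse. The caller frees the result.
 */
static struct cpu_slot *alloc_cpu_slots(int *nr_slots)
{
	*nr_slots = get_nprocs_conf();
	return calloc(*nr_slots, sizeof(struct cpu_slot));
}
```

With the numbers from the report, `slot_index_ok(167, 112)` is false (112 online CPUs, highest online id 167), while `slot_index_ok(167, 224)` is true once the table is sized by the 224 configured CPUs.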