Message-ID: <d05abb37e19055d249b720c1ac448734fc6ea84f.camel@gmx.de>
Date: Tue, 02 Jul 2024 07:08:35 +0200
From: Mike Galbraith <efault@....de>
To: Chen Yu <yu.c.chen@...el.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
 Juri Lelli <juri.lelli@...hat.com>, Vincent Guittot
 <vincent.guittot@...aro.org>, Tim Chen <tim.c.chen@...el.com>, Yujie Liu
 <yujie.liu@...el.com>, K Prateek Nayak <kprateek.nayak@....com>, "Gautham R
 . Shenoy" <gautham.shenoy@....com>,  Chen Yu <yu.chen.surf@...il.com>,
 linux-kernel@...r.kernel.org, Raghavendra K T <raghavendra.kt@....com>
Subject: Re: [PATCH 1/2] sched/fair: Record the average duration of a task

On Mon, 2024-07-01 at 22:57 +0800, Chen Yu wrote:
> > Just take a look at the high speed ping-pong thing (not a benchmark,
> > that's a box full of tape measures, rather silly, but..).  TCP_RR IS
> > 1:1, has as short a duration as network stack plus scheduler can
> > possibly make it, and is nearly synchronous to boot, two halves of a
> > whole, the ONLY thing you can certainly safely stack..
>
> I agree, this is a limited scenario.
>
> > but a shared L2 box still takes a wee hit when you do so.
>
> According to a test conducted last month on a system with 500+ CPUs where 4 CPUs
> share the same L2 cache, around 20% improvement was noticed (though not as much
> as on the non-L2 shared platform).

This dinky box doesn't have 500 cores, but it's.. aw, adorable :)

rpi4:/root # ONLY=TCP_RR netperf.sh
TCP_RR-1        unbound    Avg:  31754  Sum:    31754
TCP_RR-1        stacked    Avg:  26625  Sum:    26625
TCP_RR-1        cross-core Avg:  32325  Sum:    32325

rpi4:/root # tbench.sh 1 30 2>&1|grep Throughput
Throughput 139.024 MB/sec  1 clients  1 procs  max_latency=1.116 ms
rpi4:/root # taskset -c 3 tbench.sh 1 30 2>&1|grep Throughput
Throughput 116.765 MB/sec  1 clients  1 procs  max_latency=0.340 ms
rpi4:/root #

This little box, running its stock 6.6.33 distro kernel, pulls out a
cross-core win for both the maximally synchronous TCP_RR and the
somewhat less synchronous, but still pretty close, tbench.  The numbers
mean little though: one propagation speed is lovely, but were there more,
I'd be as stuck with them as I am with rpi4's one-speed (all ahead slow)
gearbox.
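
For reference, the relative gaps in the figures above can be worked out
with a few lines of Python.  This is only a sketch: the helper name
pct_change and the labels are made up for illustration, and treating the
"taskset -c 3" tbench run as the stacked case is an assumption drawn from
the commands shown, not something the output itself states.

```python
# Figures copied from the netperf.sh and tbench.sh output above.
tcp_rr = {"unbound": 31754, "stacked": 26625, "cross-core": 32325}
tbench = {"unbound": 139.024, "pinned_cpu3": 116.765}  # MB/sec


def pct_change(baseline: float, value: float) -> float:
    """Relative change of value versus baseline, in percent."""
    return (value - baseline) / baseline * 100.0


if __name__ == "__main__":
    base = tcp_rr["unbound"]
    for name in ("stacked", "cross-core"):
        print(f"TCP_RR {name:10s} vs unbound: {pct_change(base, tcp_rr[name]):+.2f}%")
    print(f"TCP_RR cross-core vs stacked: "
          f"{pct_change(tcp_rr['stacked'], tcp_rr['cross-core']):+.2f}%")
    print(f"tbench pinned     vs unbound: "
          f"{pct_change(tbench['unbound'], tbench['pinned_cpu3']):+.2f}%")
```

On these numbers, stacking costs roughly a sixth of the throughput in
both tests, while cross-core placement lands within a couple of percent
of what the scheduler picks on its own.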

	-Mike
