linux-kernel - Re: lmbench ctxsw regression with CFS

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Thu, 2 Aug 2007 04:41:32 +0200
From:	Nick Piggin <npiggin@...e.de>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: lmbench ctxsw regression with CFS

On Wed, Aug 01, 2007 at 07:31:26PM -0700, Linus Torvalds wrote:
> 
> 
> On Thu, 2 Aug 2007, Nick Piggin wrote:
> > 
> > lmbench 3 lat_ctx context switching time with 2 processes bound to a
> > single core increases by between 25%-35% on my Core2 system (didn't do
> > enough runs to get more significance, but it is around 30%). The problem
> > bisected to the main CFS commit.
> 
> One thing to check out is whether the lmbench numbers are "correct". 
> Especially on SMP systems, the lmbench numbers are actually *best* when 
> the two processes run on the same CPU, even though that's not really at 
> all the best scheduling - it's just that it artificially improves lmbench 
> numbers because of the close cache affinity for the pipe data structures.

Yes, I bound them to a single core.


> So when running the lmbench scheduling benchmarks on SMP, it actually 
> makes sense to run them *pinned* to one CPU, because then you see the true 
> scheduler performance. Otherwise you easily get noise due to balancing 
> issues, and a clearly better scheduler can in fact generate worse 
> numbers for lmbench.
> 
> Did you do that? It's at least worth testing. I'm not saying it's the case 
> here, but it's one reason why lmbench3 has the option to either keep 
> processes on the same CPU or force them to spread out (and both cases are 
> very interesting for scheduler testing, and tell different things: the 
> "pin them to the same CPU" shows the latency on one runqueue, while the 
> "pin them to different CPU's" shows the latency of a remote wakeup).
> 
> IOW, while we used the lmbench scheduling benchmark pretty extensively in 
> early scheduler tuning, if you select the defaults ("let the system just 
> schedule processes on any CPU") the end result really isn't necessarily a 
> very meaningful value: getting the best lmbench numbers actually requires 
> you to do things that tend to be actively *bad* in real life.
> 
> Of course, a perfect scheduler would notice when two tasks are *so* 
> closely related and only do synchronous wakups, that it would keep them on 
> the same core, and get the best possible scores for lmbench, while not 
> doing that for other real-life situations. So with a *really* smart 
> scheduler, lmbench numbers would always be optimal, but I'm not sure 
> aiming for that kind of perfection is even worth it!

Agreed with all your comments on multiprocessor balancing, but that
was eliminated in these tests. I remote wakeup latency is another thing
I want to test, but it isn't so interesting until the serial regression
is fixed.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/