linux-kernel - Re: [patchlet] Re: Epic regression in throughput since v2.6.23


Date:	Thu, 17 Sep 2009 07:06:40 +0200
From:	Mike Galbraith <efault@....de>
To:	Serge Belyshev <belyshev@...ni.sinp.msu.ru>
Cc:	Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [patchlet] Re: Epic regression in throughput since v2.6.23

Aw poo, forgot to add Peter to CC list before poking xmit.

On Thu, 2009-09-17 at 06:55 +0200, Mike Galbraith wrote:
> On Wed, 2009-09-16 at 23:18 +0000, Serge Belyshev wrote:
> > Ingo Molnar <mingo@...e.hu> writes:
> > 
> > > Ok, i think we've got a handle on that finally - mind checking latest 
> > > -tip?
> > 
> > Kernel build benchmark:
> > http://img11.imageshack.us/img11/4544/makej20090916.png
> > 
> > I have also repeated video encode benchmarks described here:
> > http://article.gmane.org/gmane.linux.kernel/889444
> > 
> > "x264 --preset ultrafast":
> > http://img11.imageshack.us/img11/9020/ultrafast20090916.png
> > 
> > "x264 --preset medium":
> > http://img11.imageshack.us/img11/7729/medium20090916.png
> 
> Pre-ramble..
> Most of the performance differences I've examined in all these CFS vs
> BFS threads boil down to fair scheduler vs unfair scheduler.  If you
> favor hogs, naturally, hogs getting more bandwidth perform better than
> hogs getting their fair share.  That's wonderful for hogs, somewhat less
> than wonderful for their competition.  It is well known that fairness is
> not necessarily the best thing for throughput.  If you've got a single
> dissimilar task load running alone, favoring hogs may perform better..
> or not.  What about mixed loads though?  Is the throughput of frequent
> switchers less important than hog throughput?
> 
> Moving right along..
> 
> That x264 thing uncovered an interesting issue within CFS.  That load is
> a frequent clone() customer, and when it has to compete against a not so
> fork/clone happy load, it suffers mightily.  Even when running solo, i.e.
> only competing against its own siblings, IFF sleeper fairness is
> enabled, the pain of thread startup latency is quite visible.  With
> concurrent loads, it is agonizingly painful.
> 
> concurrent load test
> tbench 8 vs
> x264 --preset ultrafast --no-scenecut --sync-lookahead 0 --qp 20 -o /dev/null --threads 8 soccer_4cif.y4m
> 
> (i can turn knobs and get whatever numbers i want, including
> outperforming bfs, concurrent or solo.. not the point)
> 
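The START_DEBIT and NO_START_DEBIT labels on the results below match the tokens that the scheduler's debugfs feature file accepts. The sketch below shows one way such a flag can be flipped at runtime without rebuilding. It assumes a kernel built with CONFIG_SCHED_DEBUG and debugfs mounted at /sys/kernel/debug; the helper names are hypothetical, and nothing in the mail says whether the runs were done this way or with rebuilt kernels.

```python
# Path assumes debugfs is mounted at /sys/kernel/debug and the kernel was
# built with CONFIG_SCHED_DEBUG; both are assumptions about the test box.
SCHED_FEATURES = "/sys/kernel/debug/sched_features"


def feature_token(name, enabled):
    """Return the token the feature file expects: NAME or NO_NAME."""
    return name if enabled else "NO_" + name


def set_sched_feature(name, enabled, path=SCHED_FEATURES):
    """Write one feature token to the feature file (root is required)."""
    with open(path, "w") as f:
        f.write(feature_token(name, enabled))


def listed_features(path=SCHED_FEATURES):
    """Return the set of whitespace-separated tokens found in the file."""
    with open(path) as f:
        return set(f.read().split())


# Example, as root on a suitably configured kernel:
#   set_sched_feature("START_DEBIT", False)
#   "NO_START_DEBIT" in listed_features()
```

The path is a parameter so the helpers can be pointed at an ordinary file when trying them out on a machine without the debugfs entry.
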
> START_DEBIT
> encoded 600 frames, 44.29 fps, 22096.60 kb/s
> encoded 600 frames, 43.59 fps, 22096.60 kb/s
> encoded 600 frames, 43.78 fps, 22096.60 kb/s
> encoded 600 frames, 43.77 fps, 22096.60 kb/s
> encoded 600 frames, 45.67 fps, 22096.60 kb/s
> 
> 8   1068214   672.35 MB/sec  execute  57 sec
> 8   1083785   672.16 MB/sec  execute  58 sec
> 8   1099188   672.18 MB/sec  execute  59 sec
> 8   1114626   672.00 MB/sec  cleanup  60 sec
> 8   1114626   671.96 MB/sec  cleanup  60 sec
> 
> NO_START_DEBIT
> encoded 600 frames, 123.19 fps, 22096.60 kb/s
> encoded 600 frames, 123.85 fps, 22096.60 kb/s
> encoded 600 frames, 120.05 fps, 22096.60 kb/s
> encoded 600 frames, 123.43 fps, 22096.60 kb/s
> encoded 600 frames, 121.27 fps, 22096.60 kb/s
> 
> 8    848135   533.79 MB/sec  execute  57 sec
> 8    860829   534.08 MB/sec  execute  58 sec
> 8    872840   533.74 MB/sec  execute  59 sec
> 8    885036   533.66 MB/sec  cleanup  60 sec
> 8    885036   533.64 MB/sec  cleanup  60 sec
> 
> 2.6.31-bfs221-smp
> encoded 600 frames, 169.00 fps, 22096.60 kb/s
> encoded 600 frames, 163.85 fps, 22096.60 kb/s
> encoded 600 frames, 161.00 fps, 22096.60 kb/s
> encoded 600 frames, 155.57 fps, 22096.60 kb/s
> encoded 600 frames, 162.01 fps, 22096.60 kb/s
> 
> 8    458328   287.67 MB/sec  execute  57 sec
> 8    464442   288.68 MB/sec  execute  58 sec
> 8    471129   288.71 MB/sec  execute  59 sec
> 8    477643   288.61 MB/sec  cleanup  60 sec
> 8    477643   288.60 MB/sec  cleanup  60 sec
> 
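To make the three result sets easier to compare, here is a small sketch that averages the five x264 runs per configuration and sets them next to the last tbench line of each run. The numbers are copied from the output above; the dictionary and function names are illustrative only.

```python
from statistics import mean

# x264 frames per second, five runs each, copied from the results above.
x264_fps = {
    "START_DEBIT": [44.29, 43.59, 43.78, 43.77, 45.67],
    "NO_START_DEBIT": [123.19, 123.85, 120.05, 123.43, 121.27],
    "2.6.31-bfs221-smp": [169.00, 163.85, 161.00, 155.57, 162.01],
}

# Last tbench throughput line (MB/sec) of each run above.
tbench_mb_per_sec = {
    "START_DEBIT": 671.96,
    "NO_START_DEBIT": 533.64,
    "2.6.31-bfs221-smp": 288.60,
}


def summarize(fps, tbench):
    """Return {config: (mean x264 fps, tbench MB/sec)}."""
    return {name: (mean(runs), tbench[name]) for name, runs in fps.items()}


summary = summarize(x264_fps, tbench_mb_per_sec)
base_fps, base_mbs = summary["START_DEBIT"]

# Print each configuration relative to the START_DEBIT baseline.
for name, (fps, mbs) in summary.items():
    print(f"{name:18s} x264 {fps:7.2f} fps ({fps / base_fps:4.2f}x)  "
          f"tbench {mbs:7.2f} MB/sec ({mbs / base_mbs:4.2f}x)")
```

Read this way, disabling START_DEBIT lifts x264 from about 44 fps to about 122 fps while tbench gives up roughly a fifth of its throughput, and BFS pushes x264 further at the cost of more than half of tbench's throughput.
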
> patchlet:
> 
> sched: disable START_DEBIT.
> 
> START_DEBIT induces unfairness to loads which fork/clone frequently when they
> must compete against loads which do not.
> 
> 
> Signed-off-by: Mike Galbraith <efault@....de>
> Cc: Ingo Molnar <mingo@...e.hu>
> Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> LKML-Reference: <new-submission>
> 
>  kernel/sched_features.h |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/sched_features.h b/kernel/sched_features.h
> index d5059fd..2fc94a0 100644
> --- a/kernel/sched_features.h
> +++ b/kernel/sched_features.h
> @@ -23,7 +23,7 @@ SCHED_FEAT(NORMALIZED_SLEEPER, 0)
>   * Place new tasks ahead so that they do not starve already running
>   * tasks
>   */
> -SCHED_FEAT(START_DEBIT, 1)
> +SCHED_FEAT(START_DEBIT, 0)
>  
>  /*
>   * Should wakeups try to preempt running tasks.
> 
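
As a rough illustration of what the flag changes, the toy model below places a newly forked task on a run queue. With START_DEBIT set, the new task's virtual runtime starts one slice beyond the queue minimum, so it waits behind the tasks already queued; with it cleared, the task starts at the minimum. This is a simplified sketch and not the kernel code: it assumes all tasks have equal weight, the tunable values are illustrative, and the function names are hypothetical.

```python
# Illustrative tunables in nanoseconds. The real values are sysctls that
# depend on kernel version and CPU count, so treat these as assumptions.
SCHED_LATENCY_NS = 20_000_000
MIN_GRANULARITY_NS = 4_000_000
NR_LATENCY = SCHED_LATENCY_NS // MIN_GRANULARITY_NS


def sched_period(nr_running):
    """Length of one scheduling period for nr_running equal-weight tasks."""
    if nr_running > NR_LATENCY:
        # Too many tasks to fit in the latency target: stretch the period.
        return nr_running * MIN_GRANULARITY_NS
    return SCHED_LATENCY_NS


def initial_vruntime(min_vruntime, nr_running, start_debit):
    """Starting vruntime of a new task joining nr_running queued tasks."""
    vruntime = min_vruntime
    if start_debit:
        # The new task is counted in the period and charged one slice
        # up front, which pushes it toward the back of the queue.
        nr = nr_running + 1
        vruntime += sched_period(nr) // nr
    return vruntime


for nr in (1, 8):
    debit = initial_vruntime(0, nr, True) - initial_vruntime(0, nr, False)
    print(f"{nr} queued task(s): new task starts {debit / 1e6:.1f} ms behind")
```

Under these assumptions every clone() pays that debit again, which is consistent with the thread startup latency described above for a load that creates threads frequently.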
