Message-Id: <1233223979.5294.41.camel@bugs-laptop>
Date: Thu, 29 Jan 2009 11:12:59 +0100
From: Thomas Pilarski <thomas.pi@...or.de>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Mike Galbraith <efault@....de>,
Gregory Haskins <ghaskins@...ell.com>,
bugme-daemon@...zilla.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [Bugme-new] [Bug 12562] New: High overhead while switching or
synchronizing threads on different cores
> > There is a regression, because of the improved cpu switching. The
> > problem exists in every kernel.
>
> This is a contradiction in terms - twice.
>
> If it is a regression, then clearly things haven't improved.
>
> If it is a regression, state clearly when it worked last. If it never
> worked, it cannot be a regression.
There is an improvement in load balancing for single-threaded
applications. For my problem it is a regression. But the problem exists
in every kernel I have tested.
> > I takes a lot of time to switch between the threads, when they are
> > executed on different cores.
> > Perhaps of the big buffer size of 512KB?
>
> Of course, pushing 512kb to another cpu means lots and lots of cache
> misses.
I have tried 2.6.15, 2.6.18 and 2.6.20 too, but I see the same behavior
as in 2.6.24.
With Windows I get 64 messages per second with a buffer size of 512 KB.
This drops to 16 messages per second with a buffer size of 1 MB. But I
think it is not really comparable, because there is almost no CPU
consumption with 512 KB. Perhaps random() works differently. By
increasing the CPU usage in the producer eightfold, I get 16 msg/s and
both cores are about 50% used. Doing the same with Linux I get a
throughput of ~2 msg/s.
If it is a caching issue, shouldn't it exist in Windows too?
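One way to check whether the slowdown really comes from the threads
running on different cores would be to pin the producer and the consumer
to chosen cores and repeat the run, once with both on the same core and
once on different cores. The following is only a minimal sketch,
assuming Linux with glibc (pthread_setaffinity_np is a GNU extension).
The helper names first_allowed_cpu and pin_thread_to_cpu are
hypothetical and are not part of the attached test program.

```c
#define _GNU_SOURCE            /* needed for pthread_setaffinity_np and CPU_* macros */
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: lowest-numbered CPU the calling thread may run on,
 * or -1 if the affinity mask cannot be read. */
static int first_allowed_cpu(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)   /* 0 = calling thread */
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Hypothetical helper: restrict `thread` to exactly one CPU.
 * Returns 0 on success, otherwise an errno-style error number. */
static int pin_thread_to_cpu(pthread_t thread, int cpu)
{
    cpu_set_t set;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(thread, sizeof(set), &set);
}
```

Calling pin_thread_to_cpu(pthread_self(), cpu) at the start of each
thread function would be enough for such a comparison. If the throughput
with both threads pinned to one core is much higher than with the
threads on two cores, that would point at the cost of moving the buffer
between caches rather than at the scheduler itself.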
Using a smaller buffer of 4KB, the test is executed on one core only.
./schedulerissue 1 4096 8 2000
All threads finished: 2000 messages in 1.631 seconds / 1226.076 msg/s
real 0m1.635s
user 0m1.352s
sys 0m0.052s
But I want to use both cores to increase the performance. Adding a
second producer and a second consumer reduces the throughput to about a
third (421 msg/s instead of 1226 msg/s), although both cores are used.
./schedulerissue 2 4096 8 2000
All threads finished: 1999 messages in 4.744 seconds / 421.379 msg/s
real 0m4.748s
user 0m3.280s
sys 0m5.852s
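For readers without the attachment, the kind of hand-over measured above
can be sketched as follows. This is not the code from
ThreadSchedulingIssue.c. It is a simplified illustration with one
producer and one consumer, assuming a single-slot mailbox protected by a
mutex and a condition variable. The names mailbox, producer, consumer
and run_exchange are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical single-slot mailbox shared by one producer and one consumer. */
struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t  changed;   /* signalled whenever the slot is filled or emptied */
    char           *slot;      /* NULL while the mailbox is empty */
    size_t          buf_size;  /* size of each message buffer in bytes */
    int             to_send;   /* number of messages to exchange */
    int             received;  /* written by the consumer only */
};

static void *producer(void *arg)
{
    struct mailbox *mb = arg;

    for (int i = 0; i < mb->to_send; i++) {
        char *buf = malloc(mb->buf_size);
        if (buf == NULL)
            abort();
        /* Write every byte so the buffer ends up in the producer's cache. */
        memset(buf, i & 0xff, mb->buf_size);

        pthread_mutex_lock(&mb->lock);
        while (mb->slot != NULL)                 /* wait until the slot is free */
            pthread_cond_wait(&mb->changed, &mb->lock);
        mb->slot = buf;
        pthread_cond_broadcast(&mb->changed);
        pthread_mutex_unlock(&mb->lock);
    }
    return NULL;
}

static void *consumer(void *arg)
{
    struct mailbox *mb = arg;

    for (int i = 0; i < mb->to_send; i++) {
        pthread_mutex_lock(&mb->lock);
        while (mb->slot == NULL)                 /* wait until a message arrives */
            pthread_cond_wait(&mb->changed, &mb->lock);
        char *buf = mb->slot;
        mb->slot = NULL;
        pthread_cond_broadcast(&mb->changed);
        pthread_mutex_unlock(&mb->lock);

        /* Read every byte; on another core this pulls the data across caches. */
        unsigned long sum = 0;
        for (size_t j = 0; j < mb->buf_size; j++)
            sum += (unsigned char)buf[j];
        assert(sum == (unsigned long)(i & 0xff) * mb->buf_size);
        free(buf);
        mb->received++;
    }
    return NULL;
}

/* Exchange `messages` buffers of `buf_size` bytes; returns the number received. */
static int run_exchange(size_t buf_size, int messages)
{
    struct mailbox mb = { .slot = NULL, .buf_size = buf_size,
                          .to_send = messages, .received = 0 };
    pthread_t p, c;

    pthread_mutex_init(&mb.lock, NULL);
    pthread_cond_init(&mb.changed, NULL);
    /* Sketch only: treat thread-creation failure as fatal. */
    if (pthread_create(&p, NULL, producer, &mb) != 0)
        abort();
    if (pthread_create(&c, NULL, consumer, &mb) != 0)
        abort();
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    pthread_cond_destroy(&mb.changed);
    pthread_mutex_destroy(&mb.lock);
    return mb.received;
}
```

A call such as run_exchange(4096, 2000) roughly corresponds to the 4 KB
runs above, but without the additional CPU work in the producer.
Wrapping the call in two clock_gettime(CLOCK_MONOTONIC, ...) readings
would give a msg/s figure to compare between kernels.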
I have attached a new version, as the previous one could deadlock
during shutdown.
View attachment "ThreadSchedulingIssue.c" of type "text/x-csrc" (9411 bytes)