Message-ID: <4756E44E.8080607@jlab.org>
Date: Wed, 05 Dec 2007 12:47:58 -0500
From: Jie Chen <chen@...b.org>
To: Ingo Molnar <mingo@...e.hu>
CC: Simon Holm Thøgersen <odie@...aau.dk>,
Eric Dumazet <dada1@...mosbay.com>,
linux-kernel@...r.kernel.org,
Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: Possible bug from kernel 2.6.22 and above, 2.6.24-rc4
Ingo Molnar wrote:
> * Jie Chen <chen@...b.org> wrote:
>
>>> the moment you saturate the system a bit more, the numbers should
>>> improve even with such a ping-pong test.
>> You are right. If I manually do load balancing (bind unrelated
>> processes to the other cores), my test code performs as well as it
>> did in kernel 2.6.21.
>
> so right now the results don't seem too bad to me - the higher
> overhead comes from two threads running on two different cores and
> incurring the overhead of cross-core communication. In a true
> spread-out workload that synchronizes occasionally you'd get the same
> kind of overhead, so in fact this behavior is more informative of the
> real overhead, i guess. In 2.6.21 the two threads would stick to the
> same core and produce artificially low latency - which would only be
> true in a real spread-out workload if all tasks ran on the same core
> (which is hardly what you want with OpenMP).
>
I use the pthread_setaffinity_np call to bind each thread to one core.
Unless kernel 2.6.21 does not honor the affinity mask, there should be
no difference between the new kernel and the old one in how the two
threads are placed on two cores. My test code does not do any numerical
calculation, but it does spin-wait on shared/non-shared flags. The
reason I am using affinity is to measure the synchronization overhead
among different cores. On both the new and the old kernel I see 200%
CPU usage when I run my test code with two threads. Does this mean the
two threads are running on two cores? I also verified that each thread
is indeed bound to a core by using pthread_getaffinity_np.
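
For reference, the pinning and verification look roughly like this (a
minimal sketch, not my actual test code; the core number and the error
handling are only illustrative):

/* Pin the calling thread to one core and read the mask back to
 * confirm the kernel honors it.  Build with: gcc -pthread pin.c */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

static void pin_to_core(int core)
{
        cpu_set_t set;
        int ret;

        CPU_ZERO(&set);
        CPU_SET(core, &set);

        /* Bind the calling thread to the requested core. */
        ret = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
        if (ret != 0)
                fprintf(stderr, "setaffinity: %s\n", strerror(ret));

        /* Read the affinity mask back to verify the binding. */
        CPU_ZERO(&set);
        ret = pthread_getaffinity_np(pthread_self(), sizeof(set), &set);
        if (ret == 0)
                printf("bound to core %d: %s\n", core,
                       CPU_ISSET(core, &set) ? "yes" : "no");
}

int main(void)
{
        pin_to_core(0);         /* core 0 is just an example */
        return 0;
}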
> In any case, if i misinterpreted your numbers or if you just disagree,
> or if you have a workload/test that shows worse performance than it
> could/should, let me know.
>
> Ingo
Hi, Ingo:
Since I am using the affinity mask to bind each thread to a different
core, the synchronization overhead should increase as the number of
cores/threads increases. But what we observed in the new kernel is the
opposite: the barrier overhead is 8.93 microseconds for two threads vs
1.86 microseconds for 8 threads (on the old kernel it is 0.49 vs 1.86).
This will confuse most people who study synchronization/communication
scalability. I know my test code is not a real-world computation, which
would usually use up all the cores. I hope I have explained myself
clearly (a sketch of the barrier I am timing follows below). Thank you
very much.
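
For reference, the kind of spin-wait barrier and timing loop I am
measuring looks roughly like this (a minimal sketch, not the actual
benchmark; N_THREADS, ITERS and the sense-reversing barrier are only
illustrative, and each thread would be pinned to its own core as
described above):

/* Two threads meet repeatedly at a spin-wait barrier; the average
 * cost per barrier is printed.  Build with: gcc -O2 -pthread bar.c
 * (uses the GCC __sync atomic builtins; no explicit memory barriers,
 * which is adequate for a sketch on x86). */
#include <pthread.h>
#include <stdio.h>
#include <sys/time.h>

#define N_THREADS 2
#define ITERS     100000

static volatile int count = N_THREADS;
static volatile int sense = 0;

/* Sense-reversing barrier: every thread spins on a shared flag, so
 * the cost is dominated by cross-core cache-line traffic. */
static void spin_barrier(int *local_sense)
{
        *local_sense = !*local_sense;
        if (__sync_sub_and_fetch(&count, 1) == 0) {
                count = N_THREADS;        /* last thread in: reset ... */
                sense = *local_sense;     /* ... and release everyone  */
        } else {
                while (sense != *local_sense)
                        ;                 /* spin on the shared flag   */
        }
}

static void *worker(void *arg)
{
        int local_sense = 0;
        struct timeval t0, t1;
        int i;

        /* pthread_setaffinity_np() would be called here to pin this
         * thread to its own core, as in the sketch above. */
        gettimeofday(&t0, NULL);
        for (i = 0; i < ITERS; i++)
                spin_barrier(&local_sense);
        gettimeofday(&t1, NULL);

        if ((long)arg == 0) {             /* one thread reports */
                double us = (t1.tv_sec - t0.tv_sec) * 1e6 +
                            (t1.tv_usec - t0.tv_usec);
                printf("%.2f us per barrier\n", us / ITERS);
        }
        return NULL;
}

int main(void)
{
        pthread_t tid[N_THREADS];
        long i;

        for (i = 0; i < N_THREADS; i++)
                pthread_create(&tid[i], NULL, worker, (void *)i);
        for (i = 0; i < N_THREADS; i++)
                pthread_join(tid[i], NULL);
        return 0;
}

Dividing the elapsed time by the iteration count gives per-barrier
numbers of the kind quoted above.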
--
###############################################
Jie Chen
Scientific Computing Group
Thomas Jefferson National Accelerator Facility
12000, Jefferson Ave.
Newport News, VA 23606
(757)269-5046 (office) (757)269-6248 (fax)
chen@...b.org
###############################################