Message-ID: <20070926084035.0bda3eed@fujitsu-loaner>
Date: Wed, 26 Sep 2007 08:40:35 -0700
From: Stephen Hemminger <shemminger@...ux-foundation.org>
To: Ingo Molnar <mingo@...e.hu>
Cc: David Schwartz <davids@...master.com>,
"Linux-Kernel@...r. Kernel. Org" <linux-kernel@...r.kernel.org>,
Mike Galbraith <efault@....de>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Martin Michlmayr <tbm@...ius.com>,
Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>
Subject: Re: Network slowdown due to CFS
On Wed, 26 Sep 2007 15:31:38 +0200
Ingo Molnar <mingo@...e.hu> wrote:
>
> * David Schwartz <davids@...master.com> wrote:
>
> > > > I think the real fix would be for iperf to use blocking network
> > > > IO though, or maybe to use a POSIX mutex or POSIX semaphores.
> > >
> > > So it's definitely not a bug in the kernel, only in iperf?
> >
> > Martin:
> >
> > Actually, in this case I think iperf is doing the right thing
> > (though not the best thing) and the kernel is doing the wrong
> > thing. [...]
>
> it's not doing the right thing at all. I had a quick look at the
> source code, and the reason for that weird yield usage was that
> there's a locking bug in iperf's "Reporter thread" abstraction and
> apparently instead of fixing the bug it was worked around via a
> horrible yield() based user-space lock.
>
> the (small) patch below fixes the iperf locking bug and removes the
> yield() use. There are numerous immediate benefits of this patch:
>
> - iperf uses _much_ less CPU time. On my Core2Duo test system,
> before the patch it used up 100% CPU time to saturate 1 gigabit of
> network traffic to another box. With the patch applied it now uses 9%
> of CPU time.
>
> - sys_sched_yield() is removed altogether
>
> - i was able to measure much higher bandwidth over localhost for
> example. This is the case for over-the-network measurements as
> well.
>
> - the results are also more consistent and more deterministic, hence
> more reliable as a benchmarking tool. (the reason for that is that
> more CPU time is spent on actually delivering packets, instead of
> mindlessly polling on the user-space "lock", so we actually max out
> the CPU, instead of relying on the random proportion the workload
> was able to make progress versus wasting CPU time on polling.)
>
> sched_yield() is almost always a symptom of broken locking or some
> other bug. In that sense CFS does the right thing by exposing such bugs =B-)
>
> Ingo
A similar patch has already been submitted, since BSD wouldn't work
without it.
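
For readers unfamiliar with the pattern under discussion, here is a minimal
sketch of the general idea: a consumer that sleeps on a POSIX condition
variable instead of polling a flag with sched_yield(). This is not the iperf
patch itself. The names report_queue, report_post, report_wait and run_demo
are hypothetical and only stand in for a "reporter"-style producer/consumer
pair.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a reporter queue: just a count of pending items. */
struct report_queue {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    int             pending;
};

static struct report_queue queue = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0
};

/*
 * The yield-based approach criticised above would look roughly like:
 *
 *     while (q->pending == 0)
 *         sched_yield();      // burns CPU, depends on scheduler behaviour
 *
 * The functions below block in the kernel instead.
 */

/* Producer: publish one item and wake a sleeping consumer. */
void report_post(struct report_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->pending++;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer: sleep until an item is available, then take it.
 * The while loop guards against spurious wakeups. */
void report_wait(struct report_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->pending == 0)
        pthread_cond_wait(&q->nonempty, &q->lock);
    q->pending--;
    pthread_mutex_unlock(&q->lock);
}

/* Thread body: post n items to the shared queue. */
static void *producer(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        report_post(&queue);
    return NULL;
}

/* Post n items from one thread and consume n in the caller.
 * Returns the number of items left pending, or -1 if the thread
 * could not be created. */
int run_demo(int n)
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, producer, &n) != 0)
        return -1;
    for (int i = 0; i < n; i++)
        report_wait(&queue);
    pthread_join(tid, NULL);
    return queue.pending;
}
```

With this shape the waiting thread uses no CPU while the queue is empty, and
its wakeup does not depend on how a particular scheduler treats
sched_yield(). Build with -pthread where the toolchain requires it.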