Message-ID: <4D243B1B.9060803@redhat.com>
Date: Wed, 05 Jan 2011 11:34:19 +0200
From: Avi Kivity <avi@...hat.com>
To: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
CC: Rik van Riel <riel@...hat.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org,
Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Mike Galbraith <efault@....de>,
Chris Wright <chrisw@...s-sol.org>
Subject: Re: [RFC -v3 PATCH 2/3] sched: add yield_to function
On 01/05/2011 11:30 AM, KOSAKI Motohiro wrote:
> > On 01/05/2011 10:40 AM, KOSAKI Motohiro wrote:
> > > > On 01/05/2011 04:39 AM, KOSAKI Motohiro wrote:
> > > > > > On 01/04/2011 08:14 AM, KOSAKI Motohiro wrote:
> > > > > > > Also, if pthread_cond_signal() called sys_yield_to implicitly, we could
> > > > > > > avoid most of the Nehalem (and other P2P cache arch) lock unfairness
> > > > > > > problem. (Probably creating pthread_condattr_setautoyield_np or a similar
> > > > > > > knob is a good idea.)
> > > > > >
> > > > > > Often, the thread calling pthread_cond_signal() wants to continue
> > > > > > executing, not yield.
> > > > >
> > > > > Then, it doesn't work.
> > > > >
> > > > > After calling pthread_cond_signal(), T1 (the cond_signal caller) and T2
> > > > > (the woken thread) start racing to grab the GIL. But T1 usually wins because
> > > > > the lock variable is in T1's CPU cache. Why do kernel and userland get such
> > > > > different results? One reason is that glibc doesn't have any ticket lock scheme.
> > > > >
> > > > > If you are interested in the GIL mess and issue, please feel free to ask more.
> > > >
> > > > I suggest looking into an explicit round-robin scheme, where each thread
> > > > adds itself to a queue and an unlock wakes up the first waiter.
> > >
> > > I'm sure you haven't tried your scheme, but I did. It's slow.
> >
> > Won't anything with a heavily contended global/giant lock be slow?
> > What's the average lock hold time per thread? 10%? 50%? 90%?
>
> Well, of course everything under heavy contention is slow, but we shouldn't
> compare heavily contended with lightly contended. We have to compare
> heavily contended with heavily contended, or lightly contended with lightly
> contended. If we are talking about a scripting language VM, a pipe benchmark
> shows an impressive FIFO overhead like the one in your proposed scheme, because
> the pipe bench causes frequent GIL grab/ungrab storms. This is similar to how
> the pipe bench exposed our (very) old kernel's bottleneck. Sadly, userspace has
> no way to implement a per-cpu runqueue, I think.
A completely fair lock will likely be slower than an unfair lock.
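
For reference, the explicit round-robin scheme suggested earlier in the thread could be sketched roughly as below. This is only a minimal illustration built on plain POSIX mutex and condition variable calls, not code from the patch set or from any VM. The names fifo_lock, fifo_lock_acquire, fifo_lock_release and fifo_demo are hypothetical, and the sketch wakes all waiters on release and lets the one holding the current ticket proceed, which only approximates "wake up the first waiter".

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical FIFO ("ticket") lock: threads are served in arrival order. */
struct fifo_lock {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    unsigned long   next_ticket;  /* next ticket to hand out */
    unsigned long   now_serving;  /* ticket currently allowed to hold the lock */
};

static void fifo_lock_init(struct fifo_lock *l)
{
    pthread_mutex_init(&l->mu, NULL);
    pthread_cond_init(&l->cv, NULL);
    l->next_ticket = 0;
    l->now_serving = 0;
}

static void fifo_lock_acquire(struct fifo_lock *l)
{
    pthread_mutex_lock(&l->mu);
    unsigned long me = l->next_ticket++;   /* join the queue */
    while (me != l->now_serving)           /* sleep until it is our turn */
        pthread_cond_wait(&l->cv, &l->mu);
    pthread_mutex_unlock(&l->mu);
}

static void fifo_lock_release(struct fifo_lock *l)
{
    pthread_mutex_lock(&l->mu);
    l->now_serving++;                      /* hand the lock to the next ticket */
    pthread_cond_broadcast(&l->cv);        /* only the matching waiter proceeds */
    pthread_mutex_unlock(&l->mu);
}

/* Hypothetical demo: nthreads workers each bump a shared counter iters times. */
struct demo_arg { struct fifo_lock *lock; long *counter; int iters; };

static void *demo_worker(void *p)
{
    struct demo_arg *a = p;
    for (int i = 0; i < a->iters; i++) {
        fifo_lock_acquire(a->lock);
        (*a->counter)++;
        fifo_lock_release(a->lock);
    }
    return NULL;
}

static long fifo_demo(int nthreads, int iters)
{
    enum { MAX_THREADS = 16 };
    pthread_t tid[MAX_THREADS];
    struct fifo_lock lock;
    long counter = 0;
    struct demo_arg arg = { &lock, &counter, iters };

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    fifo_lock_init(&lock);
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tid[i], NULL, demo_worker, &arg);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return counter;
}
```

Calling fifo_demo(4, 1000) from a small main() and timing it against the same loop using a plain pthread_mutex_t would be one way to see the cost: with the FIFO lock, every release hands ownership to one specific thread, which has to be scheduled before anyone makes progress.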
> And, if we are talking about a language VM, I can't give any average time. It
> depends on the running script.
Pick some parallel, compute-intensive script, please.
--
error compiling committee.c: too many arguments to function