Message-ID: <20110521191418.GA30688@elte.hu>
Date: Sat, 21 May 2011 21:14:18 +0200
From: Ingo Molnar <mingo@...e.hu>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: linux-kernel@...r.kernel.org, randy.dunlap@...cle.com,
Valdis.Kletnieks@...edu, a.p.zijlstra@...llo.nl
Subject: Re: [GIT PULL rcu/next] fixes and breakup of memory-barrier-decrease
patch

* Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:

> On Sat, May 21, 2011 at 04:28:44PM +0200, Ingo Molnar wrote:
> >
> > * Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:
> >
> > > Hello, Ingo,
> > >
> > > This pull request covers some RCU bug fixes and one patch rework.
> > >
> > > The first group breaks up the infamous now-reverted (but ultimately
> > > vindicated) "Decrease memory-barrier usage based on semi-formal proof"
> > > commit into five commits. These five commits immediately follow the
> > > revert, and the diff across all six of these commits is empty, so that
> > > the effect of the five commits is to revert the revert.
> >
> > But ... the regression that was observed with that commit needs to be fixed
> > first, or not? In what way was the barrier commit vindicated?
>
> From what I can see, the hang was fixed by Frederic's patch at
> https://lkml.org/lkml/2011/5/19/753. I was interpreting that as vindication,
> perhaps ill-advisedly.

I mean, without Frederic's patch we are getting very long hangs due to the
barrier patch, right?

Even if the barrier patch is not to blame - somehow it still managed to produce
these hangs - and we do not understand it yet.

> Yinghai said that he was still seeing a delay, and that he was seeing it even
> with the "Decrease memory-barrier usage based on semi-formal proof" reverted:
> https://lkml.org/lkml/2011/5/20/427. This hang seems to happen when he uses
> gcc 4.5.0, but not when using gcc 4.5.1, assuming I understood his sequence
> of emails. So I was interpreting that as meaning that the delay was unlikely
> to be caused by that commit, probably by one of the later commits.
>
> I clearly need to figure out what is causing this delay. I asked Yinghai to
> apply c7a378603 (Remove waitqueue usage for cpu, node, and boost kthreads)
> from Peter Zijlstra because the long delays that Yinghai is seeing (93
> seconds for memory_dev_init() rather than 3 or 4 seconds) might be due to my
> less-efficient method of awakening the RCU kthreads, so that Peter's
> approach might help.
>
> If that doesn't speed things up for Yinghai, then I will work out some
> tracing to help localize the slowdown that he is seeing.
>
> Of course, if you would rather that I get to the bottom of this before
> pulling, fair enough!

We should fix the delay regression, I suspect - do we have to revert more stuff
perhaps?

Would it be possible to figure out what caused that other delay for Yinghai?

Thanks,

	Ingo