Message-ID: <20090821150029.GC29542@Krystal>
Date: Fri, 21 Aug 2009 11:00:29 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To: Ingo Molnar <mingo@...e.hu>
Cc: Steven Rostedt <rostedt@...dmis.org>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Josh Triplett <josht@...ux.vnet.ibm.com>,
linux-kernel@...r.kernel.org, laijs@...fujitsu.com,
dipankar@...ibm.com, akpm@...ux-foundation.org, dvhltc@...ibm.com,
niv@...ibm.com, tglx@...utronix.de, peterz@...radead.org,
hugh.dickins@...cali.co.uk, benh@...nel.crashing.org
Subject: Re: [PATCH -tip/core/rcu 1/6] Cleanups and fixes for RCU in face
of heavy CPU-hotplug stress
* Ingo Molnar (mingo@...e.hu) wrote:
>
> * Steven Rostedt <rostedt@...dmis.org> wrote:
>
> > On Fri, 21 Aug 2009, Ingo Molnar wrote:
> >
> > > * Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca> wrote:
> > >
> > > > I would not trust this architecture for synchronization tests.
> > > > There have been reports of a hardware bug affecting the cmpxchg
> > > > instruction in the field. The load fence normally implied by
> > > > its semantics seems to be missing. AFAIK, AMD never
> > > > acknowledged the problem.
> > >
> > > If cmpxchg was broken i'd be having far worse problems and very
> > > widely so.
> >
> > I believe Mathieu is suggesting that the hardware bug is not that
> > the compare and exchange does not work in cmpxchg, but that it
> > does not provide an explicit memory barrier. Such a bug is very
> > hard to trigger, since it requires a race that allows a memory
> > write/read to cross the cmpxchg, and then have this be in such a
> > place that it will cause harm.
>
> We can argue all sorts of exotic hardware bugs really, proof is
> still needed.
>
> [...]
> > > That's not a proof of course (it's near impossible to prove the
> > > lack of a bug), but it's sure a strong indicator and you'll need
> > > to provide far more proof of misbehavior before i discount a
> > > bona fide regression on this box.
> >
> > But with the above said, I totally agree with your point. More
> > proof must be given before we can discount that another bug
> > exists.
>
> Yeah. Especially given that this code was changed recently ;-)
>
Yep, I think we should keep looking for the cause of the problem, but
stress-testing the hardware with the program I just sent cannot hurt. :)
Mathieu
> Ingo
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68