Message-ID: <4E279C24.8090309@candelatech.com>
Date: Wed, 20 Jul 2011 20:25:24 -0700
From: Ben Greear <greearb@...delatech.com>
To: paulmck@...ux.vnet.ibm.com
CC: Ingo Molnar <mingo@...e.hu>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Zijlstra <peterz@...radead.org>,
Ed Tomlinson <edt@....ca>, linux-kernel@...r.kernel.org,
laijs@...fujitsu.com, dipankar@...ibm.com,
akpm@...ux-foundation.org, mathieu.desnoyers@...ymtl.ca,
josh@...htriplett.org, niv@...ibm.com, tglx@...utronix.de,
rostedt@...dmis.org, Valdis.Kletnieks@...edu, dhowells@...hat.com,
eric.dumazet@...il.com, darren@...art.com, patches@...aro.org,
edward.tomlinson@...o.bombardier.com
Subject: Re: [PATCH rcu/urgent 0/6] Fixes for RCU/scheduler/irq-threads trainwreck
On 07/20/2011 02:12 PM, Paul E. McKenney wrote:
> On Wed, Jul 20, 2011 at 01:54:49PM -0700, Ben Greear wrote:
>> On 07/20/2011 01:33 PM, Paul E. McKenney wrote:
>>> On Wed, Jul 20, 2011 at 09:57:42PM +0200, Ingo Molnar wrote:
>>>>
>>>> * Ingo Molnar<mingo@...e.hu> wrote:
>>>>
>>>>>
>>>>> * Paul E. McKenney<paulmck@...ux.vnet.ibm.com> wrote:
>>>>>
>>>>>> If my guess is correct, then the minimal non-RCU_BOOST fix is #4
>>>>>> (which drags along #3) and #6. Which are not one-liners, but
>>>>>> somewhat smaller:
>>>>>>
>>>>>> b/kernel/rcutree_plugin.h | 12 ++++++------
>>>>>> b/kernel/softirq.c | 12 ++++++++++--
>>>>>> kernel/rcutree_plugin.h | 31 +++++++++++++++++++++++++------
>>>>>> 3 files changed, 41 insertions(+), 14 deletions(-)
>>>>>
>>>>> That's half the patch size and half the patch count.
>>>>>
>>>>> PeterZ's question is relevant: since we apparently had similar bugs
>>>>> in v2.6.39 as well, what changed in v3.0 that makes them so urgent
>>>>> to fix?
>>>>>
>>>>> If it's just better instrumentation that proves them better then
>>>>> i'd suggest fixing this in v3.1 and not risking v3.0 with an
>>>>> unintended side effect.
>>>>
>>>> Ok, i looked some more at the background and the symptoms that people
>>>> are seeing: kernel crashes and lockups. I think we want these
>>>> problems fixed in v3.0, even if it was the recent introduction of
>>>> RCU_BOOST that made it really prominent.
>>>>
>>>> Having put some testing into your rcu/urgent branch today i'd feel
>>>> more comfortable with taking this plus perhaps an RCU_BOOST disabling
>>>> patch. That makes it all fundamentally tested by a number of people
>>>> (including those who reported/reproduced the problems).
>>>
>>> RCU_BOOST is currently default=n. Is that sufficient? If not, one
>>
>> Not if it remains broken, I think... unless you put it under CONFIG_BROKEN
>> or something. Otherwise, folks are liable to turn it on and not realize
>> it's the cause of subtle bugs.
>
> Good point, I could easily add "depends on BROKEN".
>
>> For what it's worth, my tests have been running clean for around 2 hours,
>> so the full set of fixes with RCU_BOOST appears good, so far. I'll let it
>> continue to run at least overnight to make sure I'm not just getting lucky...
>
> Continuing to think good thoughts... ;-)
My test is still going strong with no splats or errors, so I think that
nailed the problems I was seeing...
Thanks,
Ben
>
> Thanx, Paul
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
Ben Greear <greearb@...delatech.com>
Candela Technologies Inc http://www.candelatech.com