Message-ID: <54CBE23F.3010003@oracle.com>
Date: Fri, 30 Jan 2015 14:57:51 -0500
From: Sasha Levin <sasha.levin@...cle.com>
To: Andy Lutomirski <luto@...capital.net>,
Paul McKenney <paulmck@...ux.vnet.ibm.com>
CC: Borislav Petkov <bp@...en8.de>, X86 ML <x86@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Oleg Nesterov <oleg@...hat.com>,
Tony Luck <tony.luck@...el.com>,
Andi Kleen <andi@...stfloor.org>,
Josh Triplett <josh@...htriplett.org>,
Frédéric Weisbecker <fweisbec@...il.com>
Subject: Re: [PATCH v4 2/5] x86, traps: Track entry into and exit from IST
context
On 01/28/2015 04:02 PM, Andy Lutomirski wrote:
> On Wed, Jan 28, 2015 at 9:48 AM, Paul E. McKenney
> <paulmck@...ux.vnet.ibm.com> wrote:
>> On Wed, Jan 28, 2015 at 08:33:06AM -0800, Andy Lutomirski wrote:
>>> On Fri, Jan 23, 2015 at 5:25 PM, Andy Lutomirski <luto@...capital.net> wrote:
>>>> On Fri, Jan 23, 2015 at 12:48 PM, Sasha Levin <sasha.levin@...cle.com> wrote:
>>>>> On 01/23/2015 01:34 PM, Andy Lutomirski wrote:
>>>>>> On Fri, Jan 23, 2015 at 10:04 AM, Borislav Petkov <bp@...en8.de> wrote:
>>>>>>> On Fri, Jan 23, 2015 at 09:58:01AM -0800, Andy Lutomirski wrote:
>>>>>>>>> [ 543.999079] Call Trace:
>>>>>>>>> [ 543.999079] dump_stack (lib/dump_stack.c:52)
>>>>>>>>> [ 543.999079] lockdep_rcu_suspicious (kernel/locking/lockdep.c:4259)
>>>>>>>>> [ 543.999079] atomic_notifier_call_chain (include/linux/rcupdate.h:892 kernel/notifier.c:182 kernel/notifier.c:193)
>>>>>>>>> [ 543.999079] ? atomic_notifier_call_chain (kernel/notifier.c:192)
>>>>>>>>> [ 543.999079] notify_die (kernel/notifier.c:538)
>>>>>>>>> [ 543.999079] ? atomic_notifier_call_chain (kernel/notifier.c:538)
>>>>>>>>> [ 543.999079] ? debug_smp_processor_id (lib/smp_processor_id.c:57)
>>>>>>>>> [ 543.999079] do_debug (arch/x86/kernel/traps.c:652)
>>>>>>>>> [ 543.999079] ? trace_hardirqs_on (kernel/locking/lockdep.c:2609)
>>>>>>>>> [ 543.999079] ? do_int3 (arch/x86/kernel/traps.c:610)
>>>>>>>>> [ 543.999079] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2554 kernel/locking/lockdep.c:2601)
>>>>>>>>> [ 543.999079] debug (arch/x86/kernel/entry_64.S:1310)
>>>>>>>>
>>>>>>>> I don't know how to read this stack trace. Are we in do_int3,
>>>>>>>> do_debug, or both? I didn't change do_debug at all.
>>>>>>>
>>>>>>> It looks like we're in do_debug. do_int3 is only on the stack but not
>>>>>>> part of the current frame if I can trust the '?' ...
>>>>>>>
>>>>>>
>>>>>> It's possible that an int3 happened and I did something wrong on
>>>>>> return that caused a subsequent do_debug to screw up, but I don't see
>>>>>> how my patch would have caused that.
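My mental model of the RCU side of the patch, in case it helps -- this is my
own simplified paraphrase from the patch title and this thread, not the actual
hunk, so the real code may well differ:

/*
 * Paraphrase only: an IST exception that hits kernel code tells RCU about
 * it the way NMIs do, so the RCU read-side section inside notify_die()
 * is legal even if the CPU looked idle to RCU when the exception fired.
 */
enum ctx_state ist_enter(struct pt_regs *regs)
{
	if (user_mode_vm(regs))
		return exception_enter();	/* ordinary exception from userspace */

	/* Interrupted the kernel (possibly idle): treat it like an NMI for RCU. */
	rcu_nmi_enter();
	return IN_KERNEL;
}

void ist_exit(struct pt_regs *regs, enum ctx_state prev_state)
{
	if (user_mode_vm(regs))
		exception_exit(prev_state);
	else
		rcu_nmi_exit();			/* has to balance ist_enter() exactly */
}

If that picture is roughly right, the splat just means rcu_is_watching() was
still false by the time the die notifier in do_debug took its RCU read lock.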
>>>>>>
>>>>>> Were there any earlier log messages?
>>>>>
>>>>> Nope, nothing odd before or after.
>>>>
>>>> Trinity just survived for a decent amount of time for me with my
>>>> patches, other than a bunch of apparently expected OOM kills. I have
>>>> no idea how to tell trinity how much memory to use.
>>>
>>> A longer trinity run on a larger VM survived (still with some OOM
>>> kills, but no taint) with these patches. I suspect that it's a
>>> regression somewhere else in the RCU changes. I have
>>> CONFIG_PROVE_RCU=y, so I should have seen the failure if it was there,
>>> I think.
>>
>> If by "RCU changes" you mean my changes to the RCU infrastructure, I am
>> going to need more of a hint than I see in this thread thus far. ;-)
>>
>
> I can't help much, since I can't reproduce the problem. Presumably if
> it's a bug in -tip, someone else will trigger it, too.
I'm not sure what to tell you here; I'm not using any weird options for trinity
to reproduce it.
It doesn't happen too frequently, but I still see it happening.
Would you like me to try a debug patch or something similar?
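For example, something along these lines at the top of do_debug()/do_int3(),
right before the notify_die() call -- just a throwaway sketch, the helper name
is mine:

#include <linux/rcupdate.h>
#include <linux/hardirq.h>
#include <linux/bug.h>

/*
 * Throwaway debug check, not a real patch: warn (once) before lockdep does
 * if RCU isn't watching this CPU when we're about to run the die notifiers,
 * and dump a bit of context so we can see which entry path it was.
 */
static inline void check_rcu_watching(const char *where)
{
	WARN_ONCE(!rcu_is_watching(),
		  "RCU not watching in %s (in_nmi=%d preempt_count=%x)\n",
		  where, !!in_nmi(), preempt_count());
}

i.e. call check_rcu_watching("do_debug") / check_rcu_watching("do_int3") just
before the notifier runs, so we catch the bad state with a bit more context
than the lockdep report gives us.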
Thanks,
Sasha