Message-ID: <550DAE23.7030000@oracle.com>
Date: Sat, 21 Mar 2015 11:45:07 -0600
From: David Ahern <david.ahern@...cle.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
CC: linux-mm <linux-mm@...ck.org>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: 4.0.0-rc4: panic in free_block
On 3/20/15 6:47 PM, Linus Torvalds wrote:
>
>> Here's another data point: If I disable NUMA I don't see the problem.
>> Performance drops, but no NULL pointer splats which would have been panics.
>
> So the NUMA case triggers the per-node "n->shared" logic, which
> *should* be protected by "n->list_lock". Maybe there is some bug there
> - but since that code seems to do ok on x86-64 (and apparently older
> sparc too), I really would look at arch-specific issues first.
You raise a lot of valid questions, and they give me something to look
into. But if the root cause were such a fundamental issue (CPU memory
ordering, compiler bug, etc.), why would it occur only on this one code
path -- free with SLAB and NUMA -- and so consistently?
I am continuing to poke around, but I am open to any suggestions. I have
enabled every DEBUG option I can find in the memory code and nothing is
popping out. If this were a race, wouldn't all the DEBUG checks affect
the timing? Yet I am still seeing the same stack traces due to the same
root cause.
David