Message-ID: <550CB8D1.9030608@oracle.com>
Date: Fri, 20 Mar 2015 18:18:25 -0600
From: David Ahern <david.ahern@...cle.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
CC: linux-mm <linux-mm@...ck.org>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: 4.0.0-rc4: panic in free_block
On 3/20/15 4:49 PM, David Ahern wrote:
> On 3/20/15 3:17 PM, Linus Torvalds wrote:
>> In other words, if I read that sparc asm right (and it is very likely
>> that I do *not*), then "objp" is NULL, and that's why you crash.
>
> That does appear to be why. I put a WARN_ON before
> clear_obj_pfmemalloc() if objpp[i] is NULL. I got 2 splats during an
> 'allyesconfig' build and the system stayed up.
>
>>
>> That's odd, because we know that objp cannot be NULL in
>> kmem_slab_free() (even if we allowed it, like with kfree(),
>> remove_vma() cannot possibly have a NULL vma, since it dereferences it
>> multiple times).
>>
>> So I must be misreading this completely. Somebody with better sparc
>> debugging mojo should double-check my logic. How would objp be NULL?
>
> I'll add checks to higher layers and see if it reveals anything.
>
> I did ask around and apparently this bug is hit only with the new M7
> processors. DaveM: that's why you are not hitting this.
Here's another data point: if I disable NUMA, I don't see the problem.
Performance drops, but there are no NULL pointer splats (which would
otherwise have been panics).
The 128-CPU ldom with NUMA enabled shows the problem every single time I
do a kernel compile (-j 128). With NUMA disabled I have done 3
allyesconfig compiles without hitting the problem. I'll put the compiles
into a loop while I head out for dinner.
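
For anyone following along, here is a minimal user-space sketch of the kind
of check described in the quoted text above: warn when objpp[i] is NULL
before the pointer is used. This is not the actual debug patch and not
kernel code. SKETCH_WARN_ON and count_null_objects are hypothetical names
made up for illustration, and skipping the NULL entry after warning is an
assumption; the thread does not say what the real change did after the
warning fired.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * User-space stand-in for the kernel's WARN_ON(): report the condition,
 * evaluate to 1 if it was true, and keep running. Hypothetical macro,
 * not the kernel implementation.
 */
#define SKETCH_WARN_ON(cond) \
	((cond) ? (fprintf(stderr, "WARNING: %s at %s:%d\n", \
			   #cond, __FILE__, __LINE__), 1) : 0)

/*
 * Walk an array of object pointers the way a free path might, warning on
 * NULL entries and skipping them instead of dereferencing them.
 * Returns the number of NULL entries seen.
 */
static int count_null_objects(void **objpp, int nr_objects)
{
	int i, nulls = 0;

	assert(objpp != NULL);

	for (i = 0; i < nr_objects; i++) {
		if (SKETCH_WARN_ON(objpp[i] == NULL)) {
			nulls++;
			continue;	/* assumption: skip rather than crash */
		}
		/* ... per-object processing would go here ... */
	}
	return nulls;
}
```

Calling count_null_objects() on an array that contains NULL entries prints
one warning per NULL entry to stderr and returns how many were seen, which
mirrors the "splat but stay up" behaviour reported above.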